Abstract



We consider stochastic volatility processes with heavy tails and possible long memory
in volatility. We study the limiting conditional distribution of future events given that
some present or past event was extreme (i.e. above a level which tends to infinity). Even
though extremes of stochastic volatility processes are asymptotically independent (in the
sense of extreme value theory), these limiting conditional distributions differ from the
i.i.d. case. We introduce estimators of these limiting conditional distributions and study
their asymptotic properties. If volatility has long memory, then the rate of convergence
and the limiting distribution of the centered estimators can depend on the long memory
parameter (Hurst index).



1 Introduction

One of the empirical features of financial data is that log-returns are uncorrelated, but their
squares, or absolute values, are dependent, possibly with long memory. Another important
feature is that log-returns are heavy-tailed. There are two common classes of processes to
model such behaviour: the generalized autoregressive conditional heteroscedastic (GARCH)
process and the stochastic volatility (SV) process, the latter introduced by Breidt et al. [1998]
and Harvey [1998]. The former class of models rules out long memory in the squares, while the
latter allows for it. We will therefore concentrate in this paper on the class of SV processes,
which we define now.

Let $\{Y_j, j \in \mathbb{Z}\}$ be the observed process (e.g. log-returns of some financial time series), and
assume that it can be expressed as

$$ Y_j = \sigma(X_j) Z_j , \tag{1} $$

where $\sigma$ is some (possibly unknown) positive function, $\{Z_j, j \in \mathbb{Z}\}$ is an i.i.d. sequence and
$\{X_j, j \in \mathbb{Z}\}$ is a stationary Gaussian process with mean zero, unit variance, autocovariance
function $\{\gamma_n\}$, and independent of the i.i.d. sequence. The sequence $\sigma(X_j)$ can be seen as
a proxy for the volatility. We will assume that either $\{X_j\}$ is weakly dependent in the sense
that

$$ \sum_{n=1}^{\infty} |\gamma_n| < \infty , \tag{2} $$

or that it has long memory with Hurst index $H \in (1/2, 1)$, i.e.

$$ \gamma_n = \mathrm{cov}(X_0, X_n) = n^{2H-2} \ell(n) , \tag{3} $$

where $\ell$ is a slowly varying function.
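
To make the model (1) concrete, the following minimal sketch simulates a weakly dependent SV sequence. Every specific choice here is an assumption made for illustration and is not taken from the paper: $\sigma(x) = \exp(x)$, a Gaussian AR(1) sequence for $\{X_j\}$ (whose autocovariance $\phi^n$ is summable, so (2) holds), and Pareto noise with $P(Z > z) = z^{-\alpha}$ for $z \ge 1$. The function name `simulate_sv` and its parameters are hypothetical.

```python
import math
import random


def simulate_sv(n, alpha=2.0, phi=0.8, seed=0):
    """Simulate Y_j = exp(X_j) * Z_j for j = 1..n (illustrative assumptions only)."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)                 # stationary start: X_1 ~ N(0, 1)
    innov_sd = math.sqrt(1.0 - phi * phi)   # keeps Var(X_j) = 1 along the AR(1) recursion
    ys = []
    for _ in range(n):
        # Inverse-transform sampling of a Pareto variable: P(Z > z) = z**(-alpha), z >= 1.
        z = (1.0 - rng.random()) ** (-1.0 / alpha)
        ys.append(math.exp(x) * z)
        x = phi * x + innov_sd * rng.gauss(0.0, 1.0)  # AR(1) step, autocovariance phi**n
    return ys
```

A long memory sequence as in (3) would require a different generator for $\{X_j\}$ (for instance fractional Gaussian noise), which this sketch does not attempt.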

Furthermore, we assume that the marginal distribution $F_Z$ of the i.i.d. sequence $\{Z_j\}$ has a
regularly varying right tail with index $\alpha > 0$, i.e., writing $\bar F_Z = 1 - F_Z$, for all positive $y$,

$$ \lim_{t\to\infty} P(Z > ty \mid Z > t) = \lim_{t\to\infty} \frac{\bar F_Z(ty)}{\bar F_Z(t)} = y^{-\alpha} . \tag{4} $$

Examples of heavy tailed distributions include the stable distributions with index $\alpha \in (0, 2)$,
the t distribution with $\alpha$ degrees of freedom, and the Pareto distribution with index $\alpha$.

By Breiman's lemma (Breiman [1965], Resnick [2007]), if $E[\sigma^{\alpha+\epsilon}(X)] < \infty$ for some $\epsilon > 0$, then
the marginal distribution of $\{Y_j\}$ also has a regularly varying right tail with index $\alpha$ and

$$ \lim_{x\to\infty} \frac{P(Y > x)}{P(Z > x)} = E[\sigma^{\alpha}(X)] , \tag{5} $$

where $X$, $Y$ and $Z$ denote random variables with the same joint distribution as $X_0$, $Y_0$ and $Z_0$.
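
The limit (5) can be checked numerically. The sketch below is illustrative and rests on assumptions that are not in the paper: $\sigma(x) = \exp(x)$, $X \sim N(0,1)$ and Pareto noise with $P(Z > z) = z^{-\alpha}$ for $z \ge 1$, in which case $E[\sigma^\alpha(X)] = \exp(\alpha^2/2)$. Conditionally on $X$, $P(Y > x \mid X) = \min\{1, (\sigma(X)/x)^\alpha\}$, so only $X$ needs to be simulated. The function name `breiman_ratio` is hypothetical.

```python
import math
import random


def breiman_ratio(x, alpha=1.0, n_draws=100_000, seed=0):
    """Monte Carlo value of P(Y > x) / P(Z > x) for Y = exp(X) * Z, with x >= 1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        s = math.exp(rng.gauss(0.0, 1.0))       # sigma(X) with X ~ N(0, 1)
        # For Pareto noise, P(Z > x / s) = min(1, (s / x)**alpha).
        total += min(1.0, (s / x) ** alpha)
    p_y = total / n_draws                        # estimate of P(Y > x)
    p_z = x ** (-alpha)                          # exact Pareto tail P(Z > x)
    return p_y / p_z
```

For large `x` the returned ratio should be close to `math.exp(alpha ** 2 / 2)`, in line with (5).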

Estimation and testing of the possible long memory of such processes have been studied by
Hurvich et al. [2005]. Estimation of the tail of the marginal distribution by the Hill estimator
has been studied in Kulik and Soulier [2011].
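
For reference, the classical Hill estimator mentioned above can be sketched as follows. This is the textbook form, not code from Kulik and Soulier [2011]; the function names are hypothetical and the data are assumed to be positive at and above the threshold.

```python
import math
import random


def hill_estimator(data, k):
    """Hill estimate of the tail index alpha from the k largest observations."""
    order = sorted(data)
    n = len(order)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    threshold = order[n - k - 1]                 # order statistic Y_(n:n-k)
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    # Average log-excess of the k largest values over the threshold.
    gamma_hat = sum(math.log(v / threshold) for v in order[n - k:]) / k
    return 1.0 / gamma_hat


def pareto_sample(n, alpha, seed=0):
    """Draw n Pareto variables with P(Z > z) = z**(-alpha), z >= 1 (test helper)."""
    rng = random.Random(seed)
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]
```

Applied to `pareto_sample(20000, 2.0)` with `k = 1000`, the estimate should be close to 2.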

In this paper we are concerned with certain extremal properties of the finite dimensional 
joint distributions of the process {Yj} when Z is heavy tailed and the Gaussian process {Xj} 
possibly has long memory. 

From the extreme value point of view, there is a significant distinction between the GARCH
and SV models. In the first one, exceedances over a large threshold are asymptotically dependent
and extremes do cluster. In the SV model, exceedances are asymptotically independent.
More precisely, for any positive integer $m$, and positive real numbers $x$, $y$,

$$ \lim_{t\to\infty} t P(Y_0 > a(t)x , Y_m > a(t)y) = 0 , \tag{6} $$

where $a(t) = F_Z^{\leftarrow}(1 - 1/t)$ and $F_Z^{\leftarrow}$ is the left-continuous inverse of $F_Z$. This holds since it
can be easily shown by a conditioning argument that

$$ P(Y_0 > t , Y_m > t) \sim c \times P(Y_0 > t)^2 , \quad t \to \infty , \tag{7} $$

for some positive constant $c$.

The above observations may lead to the incorrect conclusion that, for the SV process, there 
is no spillover from past extreme observations onto future values and from the extremal 
behaviour point of view we can treat the SV process as an i.i.d. sequence. However, under 
the assumptions stated previously, it holds that 

$$ \lim_{t\to\infty} P(Y_m \le y \mid Y_0 > t) = \frac{E[\sigma^{\alpha}(X_0) F_Z(y/\sigma(X_m))]}{E[\sigma^{\alpha}(X_0)]} . \tag{8} $$

Figure 1: Empirical conditional distribution (points) and empirical distribution (solid line)
for i.i.d. data (left panel, "IID data") and for the SV model (right panel, "SV data").

Therefore, the limiting conditional distribution is influenced by the dependence structure of 
the time series. To illustrate this, we show in Figure 1 estimates of the standard distribution 
function and of the conditional distribution for a simulated SV process. Clearly, the two 
estimated distributions are different, as suggested by (8). For a comparison, we also plot the 
corresponding estimates for i.i.d. data. Other kinds of extremal events can be considered;
for instance, we may be interested in the conditional distribution of some future values given
that a linear combination (portfolio) of past values is extremely large, or that two consecutive
values are large. As in Equation (8), in each of these cases, a proper limiting distribution can
be obtained. To give a general framework for these conditional distributions, we introduce a
modified version of the extremogram of Davis and Mikosch [2009]. For fixed positive integers
$h < m$ and $h' \ge 0$, Borel sets $A \subset \mathbb{R}^h$ and $B \subset \mathbb{R}^{h'+1}$, we are interested in the limit denoted
by $\rho(A, B, m)$, if it exists:

$$ \rho(A, B, m) = \lim_{t\to\infty} P((Y_m, \dots, Y_{m+h'}) \in B \mid (Y_1, \dots, Y_h) \in tA) . \tag{9} $$

The set $A$ represents the type of events considered. For instance, if we choose
$A = \{(x, y, z) \in [0, \infty)^3 \mid x + y + z > 1\}$, then for large $t$, $\{(Y_{-2}, Y_{-1}, Y_0) \in tA\}$ is the event that the sum of
the last three observations was extremely large. The set $B$ represents the type of future events
of interest.

In the original definition of the extremogram of Davis and Mikosch [2009] , the set B is also 
dilated by t. This is well suited to the context of asymptotic dependence, as arises in GARCH 
processes. But in the context of asymptotic independence, this would yield a degenerate limit: 
if $h < m$, then for most sets $A$ and $B$,

$$ \lim_{t\to\infty} P((Y_m, \dots, Y_{m+h'}) \in tB \mid (Y_1, \dots, Y_h) \in tA) = 0 . $$
The general aim of this paper is to investigate the existence of the limiting conditional
distributions appearing in (9) and their statistical estimation. The paper is a first step
towards understanding conditional laws for stochastic volatility models. Although we provide
theoretical properties of estimators, their practical use should be investigated in conjunction
with resampling techniques. This is a topic of the authors' current research.

The paper is structured as follows. In Section 2, we present a general framework that enables
us to treat various examples in a unified way. In Section 3 we present the estimation procedure
with appropriate limiting results.

The proofs are given in Section 4. In the Appendix we collect relevant results on second order 
regular variation, (long memory) Gaussian processes, and criteria for tightness. 

We conclude this introduction by gathering some notation that will be used throughout the
paper. We denote convergence in probability by $\to_P$, weak convergence of sequences of
random variables or vectors by $\to_d$ and weak convergence in the Skorokhod space $\mathcal{D}(\mathbb{R}^q)$ of
càdlàg functions defined on $\mathbb{R}^q$ endowed with the $J_1$ topology by $\Rightarrow$.

Boldface letters denote vectors. Products of vectors and inequalities between vectors are taken
componentwise: $\mathbf{u} \cdot \mathbf{v} = (u_1 v_1, \dots, u_d v_d)'$; $\mathbf{x} \le \mathbf{y}$ if and only if $x_i \le y_i$ for all $i = 1, \dots, d$. The
(multivariate) interval $(-\infty, \mathbf{y}]$ is defined accordingly: $(-\infty, \mathbf{y}] = \prod_{i=1}^{d} (-\infty, y_i]$.

For any univariate process $\{\xi_j\}$ and any integers $h \le h'$, let $\boldsymbol{\xi}_{h,h'}$ denote the $(h' - h + 1)$-dimensional
vector $(\xi_h, \dots, \xi_{h'})$.

For $A \subset \mathbb{R}^d$ and $\mathbf{u} \in (0, \infty)^d$, $\mathbf{u}^{-1} \cdot A = \{\mathbf{x} \in \mathbb{R}^d \mid \mathbf{u} \cdot \mathbf{x} \in A\}$.

If $\mathbf{X}$ is a random vector, we denote by $L^p(\mathbf{X})$ the set of measurable functions $f$ such that
$E[|f(\mathbf{X})|^p] < \infty$.

The $\sigma$-field generated by the process $\{X_j\}$ is denoted by $\mathcal{X}$.



2 Regular variation on subcones

Since we consider dilated sets $tA$, where $A \subset \mathbb{R}^h$ for some integer $h > 0$, it is natural to
consider cones, that is, subsets $C$ of $[0, \infty]^h$ such that $t\mathbf{x} \in C$ for all $\mathbf{x} \in C$ and $t > 0$. The next
definition is related to the concept of regular variation on cones of Resnick [2008]. We endow
$\mathbb{R}^h$ with the topology induced by any norm and $[0, \infty]^h$ is the compactification of $[0, \infty)^h$. A
subset $A$ of $[0, \infty]^h \setminus \{0\}$ is relatively compact if its closure is compact. See Resnick [1987] for
more details. We first state a general assumption and will give examples afterwards.

Assumption 1. Let $h$ be a fixed positive integer. Let $C$ be a subcone of $[0, \infty]^h \setminus \{0\}$ such
that (i) for all relatively compact subsets $A$ of $C$ and all $\mathbf{u} \in (0, \infty)^h$, $\mathbf{u}^{-1} \cdot A$ is relatively
compact in $C$, and (ii) there exists a function $g_C$ and a nondegenerate Radon measure $\nu_C$ on
$C$ such that

$$ \lim_{t\to\infty} \frac{P(\mathbf{Z}_{1,h} \in tA)}{g_C(\bar F_Z(t))} = \nu_C(A) . \tag{10} $$
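Note that in the case $h = 1$, the cone $C = (0, \infty)$ and Assumption 1 is nothing more than the
regular variation of the tail of $Z_1$.

Assumption 1 implies that the function $g_C$ is regularly varying at $0$ with index $\beta_C \in (0, \infty)$
and the measure $\nu_C$ is homogeneous with index $-\alpha\beta_C$. For $s \ge 1$, define

$$ T_C(s) = \lim_{t\to\infty} \frac{g_C(\bar F_Z(ts))}{g_C(\bar F_Z(t))} = s^{-\alpha\beta_C} . $$

Next, Assumption 1 implies that for all $\mathbf{u} \in (0, \infty)^h$, it holds that

$$ \lim_{t\to\infty} \frac{P(\mathbf{u} \cdot \mathbf{Z}_{1,h} \in tA)}{g_C(\bar F_Z(t))} = \nu_C(\mathbf{u}^{-1} \cdot A) . $$

This convergence implies that there exists a function $M_A$ such that for all $\mathbf{u} \in (0, \infty)^h$,

$$ \sup_{t \ge 1} \frac{P(\mathbf{u} \cdot \mathbf{Z}_{1,h} \in tA)}{g_C(\bar F_Z(t))} \le M_A(\mathbf{u}) . \tag{11} $$

Hence, if $E[M_A(\sigma(\mathbf{X}_{1,h}))] < \infty$, by bounded convergence, we have

$$ \lim_{t\to\infty} \frac{P(\sigma(\mathbf{X}_{1,h}) \cdot \mathbf{Z}_{1,h} \in tA)}{g_C(\bar F_Z(t))} = E[\nu_C(\sigma(\mathbf{X}_{1,h})^{-1} \cdot A)] . $$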

For /i = 1, and A = (l,oo). Potter's bound imply that (11) holds with Ma{u) = Cu"''^^ for 
some constant C, i.e. 

.,p5(|2ii^<c„-. (12) 

t>i ^z[t) 

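For example, for $m > h$ and $h' \ge 0$, and for any Borel measurable set $B \subset \mathbb{R}^{h'+1}$, we have,
by the same bounded convergence argument,

$$ \lim_{t\to\infty} \frac{P(\mathbf{Y}_{1,h} \in tA , \mathbf{Y}_{m,m+h'} \in B)}{g_C(\bar F_Z(t))} = E\left[\nu_C(\sigma(\mathbf{X}_{1,h})^{-1} \cdot A) P(\mathbf{Y}_{m,m+h'} \in B \mid \mathcal{X})\right] . $$

If $E[\nu_C(\sigma(\mathbf{X}_{1,h})^{-1} \cdot A)] > 0$ (which in examples is seen to hold as soon as $\nu_C(A) > 0$), we
obtain that the extremogram defined in (9) can be expressed as

$$ \rho(A, B, m) = \lim_{t\to\infty} P(\mathbf{Y}_{m,m+h'} \in B \mid \mathbf{Y}_{1,h} \in tA) = \frac{E\left[\nu_C(\sigma(\mathbf{X}_{1,h})^{-1} \cdot A) P(\mathbf{Y}_{m,m+h'} \in B \mid \mathcal{X})\right]}{E\left[\nu_C(\sigma(\mathbf{X}_{1,h})^{-1} \cdot A)\right]} . \tag{13} $$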

We will consider the following type of cones. For $\mathbf{j} \in \{0, 1\}^h$, let $C_{\mathbf{j}}$ denote the cone defined
by

$$ C_{\mathbf{j}} = \Big\{ \mathbf{z} \in [0, \infty]^h \;\Big|\; \Big\{ \sum_{i : j_i = 0} z_i \Big\} \prod_{i : j_i = 1} z_i > 0 \Big\} . \tag{14} $$

In words, a vector $\mathbf{z} \in C_{\mathbf{j}}$ if at least one of its entries corresponding to the components of $\mathbf{j}$
equal to zero is positive, and all of its entries corresponding to the components equal to one
of $\mathbf{j}$ are positive. For $h = 1$, the only cone is $(0, \infty]$ and we will denote it $C_0$ for consistency
of the notation.

A subset $A$ is relatively compact in $C_{\mathbf{j}}$ if and only if there exists $\eta > 0$ such that, for all $\mathbf{z} \in A$,
$\sum_{i : j_i = 0} z_i > \eta$ and $z_i > \eta$ for all $i$ such that $j_i = 1$.

For example, if $h = 3$ and $\mathbf{j} = (0, 0, 1)$, then $C_{\mathbf{j}} = ([0, \infty] \times [0, \infty] \setminus \{(0, 0)\}) \times (0, \infty]$, and $A$
is a relatively compact subset of $C_{(0,0,1)}$ if there exists $\epsilon > 0$ such that $(z_1, z_2, z_3) \in A$ implies
$z_1 > \epsilon$ or $z_2 > \epsilon$, and $z_3 > \epsilon$.

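Denote $|\mathbf{j}| = j_1 + \cdots + j_h$, i.e. the number of nonzero components in $\mathbf{j}$. Then, there exists a
nonzero Radon measure $\nu_{\mathbf{j}}$ on $C_{\mathbf{j}}$ such that for each relatively compact set $A \subset C_{\mathbf{j}}$,

$$ \lim_{t\to\infty} \frac{P(\mathbf{Z}_{1,h} \in tA)}{\bar F_Z(t)^{|\mathbf{j}|+1}} = \nu_{\mathbf{j}}(A) . $$

The measure $\nu_{\mathbf{j}}$ can be described more precisely:

$$ \nu_{\mathbf{j}}(d\mathbf{z}) = \alpha^{|\mathbf{j}|+1} \Big\{ \sum_{i : j_i = 0} z_i^{-\alpha-1} \delta_i(d\mathbf{z}) \Big\} \prod_{i : j_i = 1} z_i^{-\alpha-1} dz_i , $$

where $\delta_i$ is Lebesgue's measure on the $i$-th coordinate axis, i.e. for any nonnegative
measurable function $\phi$,

$$ \int_{[0,\infty]^h} \phi(\mathbf{z}) \, \delta_i(d\mathbf{z}) = \int_0^{\infty} \phi(0, \dots, z_i, \dots, 0) \, dz_i . $$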

Moreover, for any relatively compact subset $A$ of $C_{\mathbf{j}}$, and for any $\epsilon > 0$, there exist $\eta > 0$ and
a constant $C$ (which both depend on $A$) such that, for all $\mathbf{u} \in (0, \infty)^h$,

P(uZi,/, G tA) ^ P(U,;,-,=o{n,-,^j-; > 7?} nni:,,=i{u,,Z,-, > 7]}) 



Fz{u)\i 



+1 



Fz(u)ljl+i 

i:ji=0 I j:ii = l 



(15) 



Thus (11) holds and if 



E 



i:ji = l 



< 00 



(16) 



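then, cf. (13),

$$ \lim_{t\to\infty} P(\mathbf{Y}_{m,m+h'} \in B \mid \mathbf{Y}_{1,h} \in tA) = \frac{E\left[\nu_{\mathbf{j}}(\sigma(\mathbf{X}_{1,h})^{-1} \cdot A) P(\mathbf{Y}_{m,m+h'} \in B \mid \mathcal{X})\right]}{E\left[\nu_{\mathbf{j}}(\sigma(\mathbf{X}_{1,h})^{-1} \cdot A)\right]} . $$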


Remark 1. We assume that $h < m$. Otherwise, if $m \le h$, then the vectors $\mathbf{Y}_{m,m+h'}$ and $\mathbf{Y}_{1,h}$
may be asymptotically dependent. For example, if $\{Z_j\}$ is i.i.d. with the tail distribution as
in (4), then $P(Z_2 + Z_3 > t \mid Z_1 + Z_2 > t) \to 1/2$. We do not think that this is of particular
interest, since one is primarily interested in estimating the distribution of the future vector $\mathbf{Y}_{m,m+h'}$
based on the past observations $\mathbf{Y}_{1,h}$.
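
The limit $1/2$ in Remark 1 can be illustrated by simulation. The sketch below assumes Pareto noise with $P(Z > z) = z^{-\alpha}$ for $z \ge 1$ (an illustrative choice, not taken from the paper); the function name and parameters are hypothetical. For a finite threshold $t$ the returned value is only approximately $1/2$.

```python
import random


def overlap_conditional_prob(t, alpha=1.0, n_draws=400_000, seed=0):
    """Monte Carlo estimate of P(Z2 + Z3 > t | Z1 + Z2 > t) for i.i.d. Pareto noise."""
    rng = random.Random(seed)

    def pareto():
        # Inverse-transform sampling: P(Z > z) = z**(-alpha), z >= 1.
        return (1.0 - rng.random()) ** (-1.0 / alpha)

    both = cond = 0
    for _ in range(n_draws):
        z1, z2, z3 = pareto(), pareto(), pareto()
        if z1 + z2 > t:            # conditioning event
            cond += 1
            if z2 + z3 > t:        # overlapping future event shares Z2
                both += 1
    return both / cond if cond else float("nan")
```

The two sums share the variable $Z_2$, and a single large $Z_2$ makes both sums large, which is why the overlapping windows remain asymptotically dependent.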
Remark 2. The cones $C_{\mathbf{j}}$ are the only ones such that $\mathbf{u}^{-1} \cdot A \subset C$ for all $\mathbf{u} \in (0, \infty)^h$ and every
$A \subset C$. This assumption can be relaxed and other cones could be considered if $\sigma$ is bounded
above and away from zero, but this is not a desirable assumption since for instance it rules
out the case $\sigma(x) = e^x$.

Remark 3. Consider for example $\sigma(x) = \exp(x)$. Assumption (16) is fulfilled for an arbitrary
(weak or strong) dependence structure of $\{X_j\}$. The same holds for many moment assumptions
which appear in the paper.

2.1 Examples 

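Example 1. Fix some positive integer $h$ and consider the cone $C_{\mathbf{1}} = (0, \infty)^h$. Then (10) holds
with $g_h(t) = t^h$ and $\nu_h$ defined by

$$ \nu_h(dz_1, \dots, dz_h) = \alpha^h \prod_{i=1}^{h} z_i^{-\alpha-1} dz_i . $$

Consider the set $A$ defined by $A = \{(z_1, \dots, z_h) \in \mathbb{R}_+^h \mid z_1 > 1, \dots, z_h > 1\}$. If

$$ E\Big[ \prod_{i=1}^{h} \sigma^{\alpha+\epsilon}(X_i) \Big] < \infty $$

for some $\epsilon > 0$, we obtain, for $m > h$, and $B \subset \mathbb{R}^{h'+1}$,

$$ \lim_{t\to\infty} P(\mathbf{Y}_{m,m+h'} \in B \mid Y_1 > t, \dots, Y_h > t) = \frac{E\left[ \prod_{i=1}^{h} \sigma^{\alpha}(X_i) \, P(\mathbf{Y}_{m,m+h'} \in B \mid \mathcal{X}) \right]}{E\left[ \prod_{i=1}^{h} \sigma^{\alpha}(X_i) \right]} . $$

In particular, setting $B = (-\infty, y]$ and $h' = 0$, the limiting conditional distribution of $Y_m$
given that $Y_1, \dots, Y_h$ are simultaneously large is given by

$$ \Psi_h(y) = \lim_{t\to\infty} P(Y_m \le y \mid Y_1 > t, \dots, Y_h > t) = \frac{E\left[ \prod_{i=1}^{h} \sigma^{\alpha}(X_i) \, F_Z(y/\sigma(X_m)) \right]}{E\left[ \prod_{i=1}^{h} \sigma^{\alpha}(X_i) \right]} . \tag{17} $$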
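
As an illustration of how the limit in (17) differs from the unconditional distribution, the following sketch evaluates the ratio of expectations by Monte Carlo for $h = 1$. The setting is assumed for illustration only and is not taken from the paper: $\sigma(x) = \exp(x)$, Pareto noise with $F_Z(z) = 1 - z^{-\alpha}$ for $z > 1$, and a bivariate Gaussian pair $(X_1, X_m)$ with unit variances and correlation `corr`. The function names are hypothetical.

```python
import math
import random


def pareto_cdf(z, alpha):
    """F_Z(z) for Pareto noise with P(Z > z) = z**(-alpha), z >= 1."""
    return 1.0 - z ** (-alpha) if z > 1.0 else 0.0


def psi_one(y, corr, alpha=1.0, n_draws=50_000, seed=0):
    """Monte Carlo value of E[s(X_1)^a F_Z(y / s(X_m))] / E[s(X_1)^a], s = exp."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n_draws):
        x1 = rng.gauss(0.0, 1.0)
        # X_m has unit variance and correlation `corr` with X_1.
        xm = corr * x1 + math.sqrt(1.0 - corr * corr) * rng.gauss(0.0, 1.0)
        w = math.exp(alpha * x1)                    # weight sigma^alpha(X_1)
        num += w * pareto_cdf(y / math.exp(xm), alpha)
        den += w
    return num / den
```

With `corr = 0` the weights cancel and the value reduces to the unconditional distribution function of $Y_m$. With positive correlation the value is smaller, which reflects the spillover from an extreme past observation onto future values.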



Example 2. Consider again the case $C_1 = (0, \infty)$. Another quantity of interest is the limiting
distribution of the sum of $h' + 1$ consecutive values, given that past values are extreme. To keep
notation simple, consider $h' = 1$ and, for $m > 1$,

$$ \Psi^*(y) = \lim_{t\to\infty} P(Y_m + Y_{m+1} \le y \mid Y_1 > t) = \frac{E\left[\sigma^{\alpha}(X_1) P(Y_m + Y_{m+1} \le y \mid \mathcal{X})\right]}{E[\sigma^{\alpha}(X_1)]} . $$

Estimating this distribution yields for instance empirical quantiles of the sum of future returns,
given that the present one is large.

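Example 3. Consider the cone $C_{0,0} = [0, \infty] \times [0, \infty] \setminus \{0\}$. Then (10) holds with $g_{0,0}(t) = t$
and $\nu_{0,0}$ defined by

$$ \nu_{0,0}(dz_1, dz_2) = \alpha \left\{ \delta_{(0,\infty] \times \{0\}} z_1^{-\alpha-1} dz_1 + \delta_{\{0\} \times (0,\infty]} z_2^{-\alpha-1} dz_2 \right\} . $$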
The bound (11) holds with Mji{u,v) = C{u°^~^'' + v°^~^'') for some constant C. Consider the set A
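defined by $A = \{(z_1, z_2) \in \mathbb{R}_+^2 \mid z_1 + z_2 > 1\}$. If $E[\sigma^{\alpha+\epsilon}(X_1)] < \infty$ for some $\epsilon > 0$, we obtain

$$ \lim_{t\to\infty} P(\mathbf{Y}_{m,m+h'} \in B \mid Y_1 + Y_2 > t) = \frac{E\left[\{\sigma^{\alpha}(X_1) + \sigma^{\alpha}(X_2)\} P(\mathbf{Y}_{m,m+h'} \in B \mid \mathcal{X})\right]}{E[\sigma^{\alpha}(X_1)] + E[\sigma^{\alpha}(X_2)]} . $$

In particular, take $B = (-\infty, y]$ and $h' = 0$. The limiting conditional distribution of $Y_m$ given
that $Y_1 + Y_2$ is large is defined by

$$ \Lambda(y) = \lim_{t\to\infty} P(Y_m \le y \mid Y_1 + Y_2 > t) = \frac{E\left[\{\sigma^{\alpha}(X_1) + \sigma^{\alpha}(X_2)\} F_Z(y/\sigma(X_m))\right]}{E[\sigma^{\alpha}(X_1) + \sigma^{\alpha}(X_2)]} . $$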


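Example 4. We can combine the previous examples. Consider $A = \{(z_1, z_2, z_3) \in \mathbb{R}_+^3 \mid z_1 + z_2 > 1, z_3 > 1\}$.
We may obtain for instance, for $m > 3$,

$$ \lim_{t\to\infty} P(\mathbf{Y}_{m,m+h'} \in B \mid Y_1 + Y_2 > t, Y_3 > t) = \frac{E\left[P(\mathbf{Y}_{m,m+h'} \in B \mid \mathcal{X}) \{\sigma^{\alpha}(X_1) + \sigma^{\alpha}(X_2)\} \sigma^{\alpha}(X_3)\right]}{E\left[\{\sigma^{\alpha}(X_1) + \sigma^{\alpha}(X_2)\} \sigma^{\alpha}(X_3)\right]} $$

if $E[\{\sigma^{\alpha+\epsilon}(X_1) + \sigma^{\alpha+\epsilon}(X_2)\} \sigma^{\alpha+\epsilon}(X_3)] < \infty$ for some $\epsilon > 0$. The relevant cone is $C_{0,0,1}$,
$g_{0,0,1}(t) = t^2$ and the associated measure on $C_{0,0,1}$ is defined by

$$ \nu_{0,0,1} = \alpha^2 \left\{ \delta_{(0,\infty] \times \{0\}} z_1^{-\alpha-1} dz_1 + \delta_{\{0\} \times (0,\infty]} z_2^{-\alpha-1} dz_2 \right\} z_3^{-\alpha-1} dz_3 . $$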



3 Estimation 

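To simplify the notation, assume that we observe $Y_1, \dots, Y_{n+m+h'}$. An estimator $\hat\rho_n(A, B, m)$
is naturally defined by

$$ \hat\rho_n(A, B, m) = \frac{\sum_{j=1}^{n} 1_{\{\mathbf{Y}_{j,j+h-1} \in Y_{(n:n-k)} A\}} 1_{\{\mathbf{Y}_{j+m,j+m+h'} \in B\}}}{\sum_{j=1}^{n} 1_{\{\mathbf{Y}_{j,j+h-1} \in Y_{(n:n-k)} A\}}} , $$

where $k$ is a user chosen threshold and $Y_{(n:1)} \le \cdots \le Y_{(n:n)}$ are the increasing order statistics
of the observations $Y_1, \dots, Y_n$. We will also consider the case $B = (-\infty, \mathbf{y}]$, i.e. the case of
the limiting conditional distribution of $\mathbf{Y}_{m,m+h'}$ given $\mathbf{Y}_{1,h} \in tA$, i.e.

$$ \Psi_{A,m,h'}(\mathbf{y}) = \lim_{t\to\infty} P(\mathbf{Y}_{m,m+h'} \le \mathbf{y} \mid \mathbf{Y}_{1,h} \in tA) = \rho(A, (-\infty, \mathbf{y}], m) = \frac{E\left[\nu_C(\sigma(\mathbf{X}_{1,h})^{-1} \cdot A) \prod_{i=0}^{h'} F_Z(y_i/\sigma(X_{m+i}))\right]}{E\left[\nu_C(\sigma(\mathbf{X}_{1,h})^{-1} \cdot A)\right]} , \tag{18} $$

where $\mathbf{y} = (y_0, \dots, y_{h'})$. An estimator $\hat\Psi_{n,A,m,h'}$ of $\Psi_{A,m,h'}$ is defined on $\mathbb{R}^{h'+1}$ by

$$ \hat\Psi_{n,A,m,h'}(\mathbf{y}) = \frac{\sum_{j=1}^{n} 1_{\{\mathbf{Y}_{j,j+h-1} \in Y_{(n:n-k)} A\}} 1_{\{\mathbf{Y}_{j+m,j+m+h'} \le \mathbf{y}\}}}{\sum_{j=1}^{n} 1_{\{\mathbf{Y}_{j,j+h-1} \in Y_{(n:n-k)} A\}}} . \tag{19} $$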
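
The estimator just defined is a ratio of two counts, and can be sketched as follows. This is a direct transcription under stated assumptions, not the authors' code: the sets $A$ and $B$ are passed as membership predicates `in_A` and `in_B` (hypothetical names), the data are stored 0-based so that `y[0]` is $Y_1$, and the threshold $Y_{(n:n-k)}$ is assumed to be positive so that the past block can be rescaled by it.

```python
def extremogram_estimate(y, n, k, h, m, h_prime, in_A, in_B):
    """Ratio estimator of rho(A, B, m) from observations y[0], ..., y[n + m + h_prime - 1]."""
    if len(y) < n + m + h_prime:
        raise ValueError("need at least n + m + h_prime observations")
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    threshold = sorted(y[:n])[n - k - 1]           # order statistic Y_(n:n-k)
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    hits = total = 0
    for j in range(n):
        # Past block Y_{j,j+h-1} rescaled by the threshold; the event is "block in threshold * A".
        past = [v / threshold for v in y[j:j + h]]
        if in_A(past):
            total += 1
            future = y[j + m:j + m + h_prime + 1]  # future block Y_{j+m,j+m+h'}
            if in_B(future):
                hits += 1
    return hits / total if total else float("nan")


# Usage: h = 1, A = (1, infinity), B = (-infinity, 2.5], lag m = 1.
data = [1, 5, 2, 6, 3, 7, 4, 8, 1, 1]
estimate = extremogram_estimate(
    data, n=8, k=4, h=1, m=1, h_prime=0,
    in_A=lambda block: block[0] > 1.0,
    in_B=lambda block: block[0] <= 2.5,
)
```

Taking `in_B` of the form "all coordinates below `y`" gives the value of the estimator (19) at `y`.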

In order to obtain statistical results, we need additional assumptions. We first state two 
assumptions which will be needed to prove the weak convergence of a multivariate conditional 
empirical process. 
Assumption 2. For $j = 1, \dots, h$, there exist functions $\mathcal{L}_j$ such that for all $s, s' \ge 1$,
$\mathbf{u}, \mathbf{v} \in (0, \infty)^h$,

$$ \lim_{t\to\infty} \frac{P(\mathbf{u} \cdot \mathbf{Z}_{1,h} \in tsA , \mathbf{v} \cdot \mathbf{Z}_{j,j+h-1} \in ts'A)}{g_C(\bar F_Z(t))} = \mathcal{L}_j(A, \mathbf{u}, \mathbf{v}, s, s') . \tag{20} $$

For $j = 1$ we only need that (20) holds with $\mathbf{u} = \mathbf{v}$. If $A$ is a cone, then (20) holds for $j = 1$
with $\mathcal{L}_1(A, \mathbf{u}, \mathbf{u}, s, s') = T_C(s \vee s') \nu_C(\mathbf{u}^{-1} \cdot A)$ as an immediate consequence of Assumption 1.
It may happen that $\mathcal{L}_j(A, \cdot) \equiv 0$ for $j = 2, \dots, h$. Intuitively, this happens if the fact that $\mathbf{u} \cdot \mathbf{Z}_{1,h}$ and
$\mathbf{v} \cdot \mathbf{Z}_{j,j+h-1}$ belong simultaneously to $tA$ implies that at least $h + 1$ coordinates of $\mathbf{Z}_{1,h+j-1}$
are large. This is the case for instance for Examples 1 and 4. Actually, Assumption 2 holds
for the cones $C_{\mathbf{j}}$, but a precise description of the functions $\mathcal{L}_j$ when they are not identically
zero would be extremely involved. This will only be done for Example 3. See Section 3.3.

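By the Cauchy-Schwarz inequality, if Assumptions 1 and 2 hold, then, for $s, s' \ge 1$,

$$ \mathcal{L}_j(A, \mathbf{u}, \mathbf{v}, s, s') \le \sqrt{M_A(\mathbf{u}) M_A(\mathbf{v})} . $$

Thus, if $E[M_A(\sigma(\mathbf{X}_{1,h}))] < \infty$, then the convergence in (20) is also in $L^1(\sigma(\mathbf{X}_{1,h}), \sigma(\mathbf{X}_{j,j+h-1}))$.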

The next assumption is needed for the quantities (that will appear in the limiting distribu- 
tions) to be well defined and to use bounded convergence arguments. 

Assumption 3. $E[M_A(\sigma(\mathbf{X}_{1,h}))] < \infty$.

As usual, the bias of the estimators will be bounded by a second order type condition. Let
$k$ be a nondecreasing sequence of integers, let $F_Y$ denote the distribution of $Y$ and let
$u_n = (1/\bar F_Y)^{\leftarrow}(n/k)$. Consider the measure $\mu_C$ defined on the Borel subsets of $C$ by

$$ \mu_C(A) = \frac{E\left[\nu_C(\sigma(\mathbf{X}_{1,h})^{-1} \cdot A)\right]}{(E[\sigma^{\alpha}(X)])^{\beta_C}} . \tag{21} $$

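We introduce a rate of convergence:

$$ v_n(A) = E\left[ \sup_{s \ge 1} \left| \frac{P(\mathbf{Y}_{1,h} \in u_n sA \mid \mathcal{X})}{g_C(k/n)} - T_C(s) \frac{\nu_C(\sigma(\mathbf{X}_{1,h})^{-1} \cdot A)}{(E[\sigma^{\alpha}(X)])^{\beta_C}} \right| \right] . \tag{22} $$

Lemma 1. Under Assumptions 1 and 3, $\lim_{n\to\infty} v_n(A) = 0$.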

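We also need the following quantities, which are well defined under Assumptions 1, 2 and 3.
For $j = 2, \dots, h$ and measurable subsets $B$, $B'$ of $\mathbb{R}^{h'+1}$ define

$$ \mathcal{R}_j(A, B, B') = \frac{E\left[\mathcal{L}_j(A, \sigma(\mathbf{X}_{1,h}), \sigma(\mathbf{X}_{j,j+h-1}), 0, 0) \times P(\mathbf{Y}_{m,m+h'} \in B, \mathbf{Y}_{m+j-1,m+h'+j-1} \in B' \mid \mathcal{X})\right]}{(E[\sigma^{\alpha}(X)])^{-\beta_C} E\left[\nu_C(\sigma(\mathbf{X}_{1,h})^{-1} \cdot A)\right]} $$
$$ \qquad + \frac{E\left[\mathcal{L}_j(A, \sigma(\mathbf{X}_{1,h}), \sigma(\mathbf{X}_{j,j+h-1}), 0, 0) \times P(\mathbf{Y}_{m,m+h'} \in B', \mathbf{Y}_{m+j-1,m+h'+j-1} \in B \mid \mathcal{X})\right]}{(E[\sigma^{\alpha}(X)])^{-\beta_C} E\left[\nu_C(\sigma(\mathbf{X}_{1,h})^{-1} \cdot A)\right]} . \tag{23} $$

For brevity, denote $\mathcal{R}_j(A, B) = \mathcal{R}_j(A, B, B)$.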



3.1 General result: weak dependence 



We can now state our main result in the weak dependence setting, i.e. when absolute summa- 
bility (2) of the autocovariance function of the process {Xj} holds. 

In order to simplify the proof, we make an additional assumption. 
Assumption 4. If $s < t$ then $tA \subset sA$.

This assumption holds for all the examples considered here and for most common examples.

Theorem 2. Let Assumptions 1, 2, 3, 4 and the weak dependence condition (2) hold. Assume
moreover that $\mu_C(A) > 0$, $k/n \to 0$, $n g_C(k/n) \to \infty$ and

$$ \lim_{n\to\infty} \sqrt{n g_C(k/n)} \, v_n(A) = 0 . \tag{24} $$

Then

$$ \sqrt{n g_C(k/n) \mu_C(A)} \, \{\hat\rho_n(A, B, m) - \rho(A, B, m)\} $$

converges weakly to a centered Gaussian distribution with variance

$$ \rho(A, B, m)\{1 - \rho(A, B, m)\} + \sum_{j=2}^{h \wedge (m-h)} \left\{ \mathcal{R}_j(A, B) - 2\rho(A, B, m) \mathcal{R}_j(A, B, \mathbb{R}^{h'+1}) + \rho^2(A, B, m) \mathcal{R}_j(A, \mathbb{R}^{h'+1}) \right\} . \tag{25} $$

Remark 4. If $h = 1$ or if the functions $\mathcal{L}_j$ defined in Assumption 2 are identically zero for
$j \ge 2$, then the limiting variance in (25) is simply $\rho(A, B, m)\{1 - \rho(A, B, m)\}$.

Otherwise, the additional terms can be canceled by modifying the estimator $\hat\rho_n(A, B, m)$.
Assuming we have $nh + m + h' + 1$ observations, we can define

En -. 
~ , . „ , i = l {Y(j„l)h+l,-,7iGy{n:n-fc)-4} { _ y _ GB} 

Pn[A,B,m) = — 

Noting that the events $\{\mathbf{Y}_{j,j+h-1} \in A\}$ are $h$-dependent conditionally on $\mathcal{X}$, the proof of Theorem 2
can be easily adapted to show that the limiting variance of $\sqrt{n g_C(k/n)} \{\tilde\rho_n(A, B, m) - \rho(A, B, m)\}$
is the same as in the case where $\mathcal{L}_j \equiv 0$ for $j = 2, \dots, h$. But this is of course at
the cost of an increase of the asymptotic variance, due to a different sample size.

We can also obtain the functional convergence of the estimator $\hat\Psi_{n,A,m,h'}$ of the limiting
conditional distribution function $\Psi_{A,m,h'}$, defined respectively in (19) and (18).

Corollary 3. Under the assumptions of Theorem 2, and if moreover the distribution $\Psi_{A,m,h'}$
is continuous, then

$$ \sqrt{n g_C(k/n) \mu_C(A)} \, \{\hat\Psi_{n,A,m,h'} - \Psi_{A,m,h'}\} $$

converges in $\mathcal{D}(\mathbb{R}^{h'+1})$ to a Gaussian process. If $h = 1$ or if the functions $\mathcal{L}_j$ are identically
zero for $j = 2, \dots, h$, then the limiting process can be expressed as $\mathbb{B} \circ \Psi_{A,m,h'}$, where $\mathbb{B}$ is the
standard Brownian bridge.

Note that a sufficient condition for $\Psi_{A,m,h'}$ to be continuous is that $F_Z$ is continuous.
3.2 General result: long memory

We now state our results in the framework of long memory. This requires several additional
notions, such as multivariate Hermite expansions and Hermite ranks, which are recalled in
Appendix B.

Define the functions $G_n$ and $G$ for $(\mathbf{x}, \mathbf{x}') \in \mathbb{R}^h \times \mathbb{R}^{h'+1}$ and $s \ge 1$ by

$$ G_n(A, B, s, \mathbf{x}, \mathbf{x}') = \frac{P(\sigma(\mathbf{x}) \cdot \mathbf{Z}_{1,h} \in u_n sA)}{g_C(k/n)} P(\sigma(\mathbf{x}') \cdot \mathbf{Z}_{m,m+h'} \in B) , \tag{26} $$

$$ G(A, B, \mathbf{x}, \mathbf{x}') = \lim_{n\to\infty} G_n(A, B, 1, \mathbf{x}, \mathbf{x}') = \frac{\nu_C(\sigma(\mathbf{x})^{-1} \cdot A)}{(E[\sigma^{\alpha}(X_1)])^{\beta_C}} P(\sigma(\mathbf{x}') \cdot \mathbf{Z}_{m,m+h'} \in B) . \tag{27} $$

Let $\tau_n(A, B, s)$ and $\tau(A, B)$ be the Hermite ranks with respect to $(\mathbf{X}_{1,h}, \mathbf{X}_{m,m+h'})$ of the
functions $G_n(A, B, s, \cdot, \cdot)$ and $G(A, B, \cdot, \cdot)$, respectively. Define $\tau(A) = \tau(A, \mathbb{R}^{h'+1})$.

Assumption 5. For large $n$, $\inf_{s \ge 1} \tau_n(A, B, s) = \tau(A, B)$ and $\tau(A, B) \le \tau(A)$.

This assumption is fulfilled for example when $\sigma(x) = \exp(x)$, in which case all the considered
Hermite ranks are equal to one, or if $\sigma$ is an even function with Hermite rank 2 (such as
$\sigma(x) = x^2$), in which case they are equal to two. The modification of Theorem 2 reads as
follows.

Theorem 4. Assume that $\{X_j\}$ is the long memory Gaussian sequence with covariance given
by (3). Let Assumptions 1, 2, 3, 4 and 5 hold, $\mu_C(A) > 0$, $k/n \to 0$, $n g_C(k/n) \to \infty$ and

$$ \lim_{n\to\infty} \left[ \sqrt{n g_C(k/n)} \wedge \gamma_n^{-\tau(A,B)/2} \right] v_n(A) = 0 . \tag{28} $$

(i) If $n g_C(k/n) \gamma_n^{\tau(A,B)} \to 0$, then

$$ \sqrt{n g_C(k/n) \mu_C(A)} \, \{\hat\rho_n(A, B, m) - \rho(A, B, m)\} $$

converges to a centered Gaussian distribution with variance given in (25).

(ii) If $n g_C(k/n) \gamma_n^{\tau(A,B)} \to \infty$, then $\gamma_n^{-\tau(A,B)/2} \{\hat\rho_n(A, B, m) - \rho(A, B, m)\}$ converges weakly
to a distribution which is non-Gaussian except if $\tau(A, B) = 1$.

The exact definition of the limiting distribution will be given in Section 4. It suffices to
mention here that this distribution depends on $H$ and $\tau(A, B)$. The meaning of the above
result is the following. In the long memory setting, it is still possible to obtain the same limit
as in the weakly dependent case, if $k$ (i.e., the number of high order statistics used in the
definition of the estimators) is not too large, so that both the bias and the long memory effect
are canceled.

Define a new Hermite rank $\tau^*(A) = \inf_{\mathbf{y} \in \mathbb{R}^{h'+1}} \tau(A, (-\infty, \mathbf{y}])$.
Corollary 5. Under the assumptions of Theorem 4, if the distribution function $\Psi_{A,m,h'}$ is
continuous and if $\tau^*(A) \le \tau(A)$, then

• If $n g_C(k/n) \gamma_n^{\tau^*(A)} \to 0$, then

$$ \sqrt{n g_C(k/n) \mu_C(A)} \, \{\hat\Psi_{n,A,m,h'} - \Psi_{A,m,h'}\} $$

converges in $\mathcal{D}((-\infty, +\infty)^{h'+1})$ to a Gaussian process. If $h = 1$ or if the functions $\mathcal{L}_j$ are
identically zero for $j = 2, \dots, h$, then the limiting process can be expressed as $\mathbb{B} \circ \Psi_{A,m,h'}$,
where $\mathbb{B}$ is the standard Brownian bridge.

• If $n g_C(k/n) \gamma_n^{\tau^*(A)} \to \infty$, then $\gamma_n^{-\tau^*(A)/2} \{\hat\Psi_{n,A,m,h'} - \Psi_{A,m,h'}\}$ converges in $\mathcal{D}((-\infty, +\infty)^{h'+1})$
to a process which can be expressed as $J_{A,m,h'} \mathcal{R}$, where $J_{A,m,h'}$ is a deterministic function
and $\mathcal{R}$ is a random variable, which is non-Gaussian except if $\tau^*(A) = 1$.

The exact definition of the function $J_{A,m,h'}$ and of the random variable $\mathcal{R}$ will be given in
Section 4. In any case, they are not of much practical interest. In practice, the main goal will
be to choose the number $k$ of order statistics used in the estimation procedure so that both
the bias and the long memory effect are canceled, and the limiting distribution of the weakly
dependent case can be used in the inference.

3.3 Examples 

We now discuss the Examples introduced in Section 2.1. In order to evaluate the rate of 
convergence (22), it is necessary to introduce a second order regular variation condition. We 
follow here Drees [1998]. 

Assumption 6. There exists a bounded nonincreasing function $\eta^*$ on $[0, \infty)$, regularly varying
at infinity with index $-\alpha\zeta$ for some $\zeta \ge 0$, and such that $\lim_{t\to\infty} \eta^*(t) = 0$, and there
exists a measurable function $\eta$ such that for $z > 0$,

$$ P(Z > z) = c z^{-\alpha} \exp \int_1^z \frac{\eta(s)}{s} \, ds , $$
$$ \exists C > 0 , \ \forall s \ge 0 , \ |\eta(s)| \le C \eta^*(s) . $$

On account of Breiman's lemma, if the tail of $Z$ is regularly varying with index $-\alpha$, then
the same holds for $Y = \sigma(X)Z$, as long as $X$ and $Z$ are independent and $E[\sigma^{\alpha+\epsilon}(X)] < \infty$ for some $\epsilon > 0$.
Also, the second order (SO) property is transferred from the tail of $Z$ to that of $Y$; see [Kulik and Soulier, 2011,
Proposition 2.1].

For the sake of simplicity and clarity of exposition, we will make in this section the usual
assumption that $\sigma(x) = \exp(x)$, so that the Hermite rank of $\sigma$ is 1. This avoids defining
many auxiliary functions and Hermite ranks. But the examples can of course be treated in a
more general framework. Also, we will only state the convergence results under the conditions
which imply that the limiting distribution is the same as in the weak dependence case, since
this is the case of practical interest. We only treat Examples 1 and 3 since they exhibit
the two different possibilities for the limiting distributions. The computations for the other
examples are straightforward.
3.3.1 Example 1 continued

Fix integers $h \ge 1$ and $m > h$. Recall the formula (17) for the conditional distribution of $Y_m$
given that $Y_1, \dots, Y_h$ are simultaneously large. Its estimator $\hat\Psi_{n,h}$ is defined by

$$ \hat\Psi_{n,h}(y) = \frac{\sum_{j=1}^{n} 1_{\{Y_j > Y_{(n:n-k)}, \dots, Y_{j+h-1} > Y_{(n:n-k)}, Y_{j+m} \le y\}}}{\sum_{j=1}^{n} 1_{\{Y_j > Y_{(n:n-k)}, \dots, Y_{j+h-1} > Y_{(n:n-k)}\}}} $$

with a user chosen $k$.

Assumption 2 holds with $\mathcal{L}_j(A, \cdot) \equiv 0$, $j = 2, \dots, h$. Assumption 6 and [Kulik and Soulier,
2011, Proposition 2.8] imply that if moreover



E 



h 



,i=l 



< oo (29) 



for some $\epsilon > 0$, a bound for $v_n(A)$ is then given by

$$ v_n(A) = O(\eta^*(u_n)) . \tag{30} $$

The moment restriction (29) is quite weak. In particular, it is fulfilled for $\sigma(x) = \exp(x)$; see
Remark 3. Recall that in this example Assumptions 1 and 2 hold and the functions $\mathcal{L}_j$ therein
vanish for $j \ge 2$. Also, Assumption 3 is implied by (29).

Corollary 6. Assume that $\sigma(x) = \exp(x)$. Let Assumption 6 and (29) hold. Let $k$ be such
that $k/n \to 0$, $n(k/n)^h \to \infty$, and

$$ \lim_{n\to\infty} \{n(k/n)^h\}^{1/2} \eta^*(u_n) = 0 . \tag{31} $$

In the weakly dependent case (2), or in the long memory case (3) if moreover $n(k/n)^h \gamma_n \to 0$,
then, with $\mu_C(A)$ as in (21),

$$ \sqrt{n(k/n)^h \mu_C(A)} \, \{\hat\Psi_{n,h} - \Psi_h\} \Rightarrow \mathbb{B} \circ \Psi_h $$

weakly in $\mathcal{D}((-\infty, \infty))$, where $\mathbb{B}$ is the standard Brownian bridge.

3.3.2 Example 3 continued 

Consider the estimation of $\Lambda$. An estimator is defined by

$$ \hat\Lambda_n(y) = \frac{\sum_{j=1}^{n} 1_{\{Y_j + Y_{j+1} > Y_{(n:n-k)}\}} 1_{\{Y_{j+m} \le y\}}}{\sum_{j=1}^{n} 1_{\{Y_j + Y_{j+1} > Y_{(n:n-k)}\}}} . $$

We have already shown that Assumption 1 holds, and Assumption 4 holds trivially. Assumption 2
holds with the function $\mathcal{L}_2$ defined by

$$ \mathcal{L}_2(A, u_1, u_2, v_1, v_2, s, s') = \left( \frac{1 + s}{u_2} \vee \frac{1 + s'}{v_1} \right)^{-\alpha} . \tag{32} $$

If E[cr^°^^"'"^)"^^(Xi)] < 00, then Assumption 3 holds and applying Lemma A.l, we obtain a 
bound for Vn{A): 

Vn{A) = O (ri*{un) + / " Fz{s) . (33) 

as soon as 







(34) 

Corollary 7. Let Assumption 6 and (29) hold. Let k be such that k — ?• 00, k/n — ?• and 

lim A;i/2 (n*iun) + Fz{s) ds^ = . 

In the weakly dependent case (2) or in the long memory case (3) if moreover k^n 0, then 

_ ^ / E[.-(X,) + .-(X2)] \-^/^ 
^ ^> ^ V IEk"(^i)] J 

weakly in 'D((— 00, 00)), where W is a Gaussian process with covariance 

cov(W(y),W(y')) = A(yAy')-2A(y)A(yO 

E[a''{X2){Fz{y/a{X^))Fz{y'/a{Xm+i)) + Fz{y/a{Xm))Fz{y'/a{X,^+i))}] 



+ 



E[a°(Xi) + cT°(X2)] 



Remark 5. If the estimator is modified by taking only every other observation, then $\sqrt{k}(\hat\Lambda_n - \Lambda)$
converges weakly to $2\mathbb{B} \circ \Lambda$, where $\mathbb{B}$ is the standard Brownian bridge.



4 Proofs 

For clarity of notation, denote $\sigma_i = \sigma(X_i)$, $g = g_C$, $T = T_C$ and $\beta = \beta_C$. Recall that $F_Y$
denotes the distribution function of $Y$ and $u_n = (1/\bar F_Y)^{\leftarrow}(n/k)$. By (4) and the regular
variation of $g$, it holds that $\bar F_Y(u_n) \sim E[\sigma_0^{\alpha}] \bar F_Z(u_n)$ and

$$ \lim_{n\to\infty} \frac{g(k/n)}{g(\bar F_Z(u_n))} = (E[\sigma^{\alpha}(X)])^{\beta} . $$

Whenever there is no risk of confusion, we omit dependence on $h$, $m$, $h'$ and $A$ in the notation.
For $j = 1, \dots, n$, define the following random variables

$$ W_{j,n}(s) = 1_{\{\mathbf{Y}_{j,j+h-1} \in u_n sA\}} , \ s \ge 1 , \qquad V_j(B) = 1_{\{\mathbf{Y}_{j+m,j+m+h'} \in B\}} . \tag{35} $$
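Assumption 1 together with the choice of $u_n$ implies that (recall the definitions (13) and (21)
of $\rho(A, B, m)$ and $\mu_C(A)$),

$$ \lim_{n\to\infty} \frac{E[W_{j,n}(s)]}{g(k/n)} = T(s) \mu_C(A) , \tag{36} $$
$$ \lim_{n\to\infty} \frac{E[W_{j,n}(s) V_j(B)]}{g(k/n)} = T(s) \mu_C(A) \rho(A, B, m) . \tag{37} $$

Define, for $s \ge 1$ and $\mathbf{x} \in \mathbb{R}^h$ and $\mathbf{x}' \in \mathbb{R}^{h'+1}$, the functions $L_n$ and $G_n$ by

$$ L_n(s, \mathbf{x}) = \frac{P(\sigma(\mathbf{x}) \cdot \mathbf{Z}_{1,h} \in u_n sA)}{g(k/n)} , $$
$$ G_n(s, \mathbf{x}, \mathbf{x}', B) = L_n(s, \mathbf{x}) P(\sigma(\mathbf{x}') \cdot \mathbf{Z}_{m,m+h'} \in B) . $$

With these notations, we have

$$ \frac{E[W_{j,n}(s) \mid \mathcal{X}]}{g(k/n)} = L_n(s, \mathbf{X}_{j,j+h-1}) , \tag{38} $$
$$ \frac{E[W_{j,n}(s) V_j(B) \mid \mathcal{X}]}{g(k/n)} = G_n(s, \mathbf{X}_{j,j+h-1}, \mathbf{X}_{j+m,j+m+h'}, B) . \tag{39} $$

For $\mathbf{x} \in \mathbb{R}^h$, denote

$$ L(\mathbf{x}) = \frac{\nu_C(\sigma(\mathbf{x})^{-1} \cdot A)}{(E[\sigma^{\alpha}(X)])^{\beta}} , \tag{40} $$

so that $E[L(\mathbf{X}_{1,h})] = \mu_C(A)$.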
Proof of Lemma 1. Write 
L„(s,x) - Tc(s)L(x) 

I g{k/n) J 5f(Fz(u„s)) 

+ {EK{X)])-^Tc{s) | ncx(x) _Z ,G nn.A) _ (je[,«(x)])^^(x) 

I 9{Fz{UnS)) 

Thus, recalling the definition of $v_n$ from (22), we have



Vn{A) < sup 
s>l 



9^!j^-ina%X)])-f'Tc(s) 

g{k/n) 



E[M^(cr(Xi,0)] 



+ (E[a"(X)])-''E 



sup 

.s>0 



P((t(Xi,,,) • Zi,^ G UnsA I Af) 



(EK(X)])'^L(Xi,,: 



g{Fz{uns)) 



By Assumption 1, for all x G M'^, 

(cr(x) • Zi,/i G UnSA) 



lim sup 

S>1 



g{Fz{uns)) 



- (EK(X)])'^L(x) 



. 
Moreover, by (11),



sup 

S>1 



P(cj(x) • Zi,/, G UnSA) 



g{Fz{uns)) 

Thus, by Assumption 3 and bounded convergence. 



EK(X)])^L(x) 



< 2MA(cr(x)) . 



lim E 

n— >oo 



sup 

s>0 



(EK(X)])/^L(Xi,,) 



g{Fz{uns)) 



. 



Since $g \circ \bar F_Z$ is regularly varying at infinity with negative index, by [Bingham et al., 1989,
Theorem 1.5.2], the convergence of $g(\bar F_Z(u_n s))/g(k/n)$ to $(E[\sigma^{\alpha}(X)])^{-\beta} T_C(s)$ is uniform on
$[1, \infty)$. Thus we have proved that $v_n(A) \to 0$. □

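Proof of Theorem 2. Define

$$ K(B, s) = T_C(s) \mu_C(A) \rho(A, B, m) , \qquad \hat K_n(B, s) = \frac{1}{n g(k/n)} \sum_{j=1}^{n} W_{j,n}(s) V_j(B) , $$
$$ \hat e_n(s) = \hat K_n(\mathbb{R}^{h'+1}, s) = \frac{1}{n g(k/n)} \sum_{j=1}^{n} W_{j,n}(s) , \qquad \xi_n = \frac{Y_{(n:n-k)}}{u_n} . $$

With this notation, we have

$$ \hat\rho_n(A, B, m) = \frac{\hat K_n(B, \xi_n)}{\hat e_n(\xi_n)} . $$

Equations (37) and (36) imply, respectively, that

$$ \lim_{n\to\infty} E[\hat K_n(B, s)] = K(B, s) , \qquad \lim_{n\to\infty} E[\hat e_n(s)] = T(s) \mu_C(A) . $$

With this in mind, we split

$$ \hat\rho_n(A, B, m) - \rho(A, B, m) = \frac{\hat K_n(B, \xi_n) - K(B, \xi_n)}{\hat e_n(\xi_n)} - \frac{\rho(A, B, m)}{\hat e_n(\xi_n)} \{\hat e_n(\xi_n) - \mu_C(A) T_C(\xi_n)\} . \tag{41} $$

Thus, we only need to find the correct norming sequence $w_n$ and asymptotic distribution in
$\mathcal{D}([a, b])$ for any $0 < a < b$ of the sequence of processes $w_n \{\hat K_n(B, \cdot) - K(B, \cdot)\}$. To do this,
define further

$$ K_n(B, s) = E[\hat K_n(B, s)] . \tag{42} $$

Then

$$ \hat K_n(B, s) - K(B, s) = \hat K_n(B, s) - K_n(B, s) + K_n(B, s) - K(B, s) . $$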
The term $K_n(B, s) - K(B, s)$ is a deterministic bias term that will be dealt with by the second
order condition (24). Write $\hat K_n - K_n = (n g(k/n))^{-1/2} E_{n,1} + E_{n,2}$ with

$$ E_{n,1}(B, s) = \frac{1}{\sqrt{n g(k/n)}} \sum_{j=1}^{n} \{W_{j,n}(s) V_j(B) - E[W_{j,n}(s) V_j(B) \mid \mathcal{X}]\} , \tag{43} $$

$$ E_{n,2}(B, s) = \frac{1}{n g(k/n)} \sum_{j=1}^{n} E[W_{j,n}(s) V_j(B) \mid \mathcal{X}] - K_n(B, s) = \frac{1}{n} \sum_{j=1}^{n} \{G_n(s, \mathbf{X}_{j,j+h-1}, \mathbf{X}_{j+m,j+m+h'}, B) - K_n(B, s)\} . \tag{44} $$

The term in (43) will be called the i.i.d. term. It is a sum of conditionally independent
random variables. The term in (44) will be called the dependent term. It is a function of the
dependent vectors $(\mathbf{X}_{j,j+h-1}, \mathbf{X}_{j+m,j+m+h'})$.

We now state some claims whose proofs are postponed to the end of this section. The
implication of Claims 1 and 3 is, in particular, that in the weakly dependent case only the
i.i.d. part contributes to the limit.

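Claim 1. The process $E_{n,1}$ converges in the sense of finite-dimensional distributions to a Gaussian process $W$ with covariance

$$(E[\sigma^\alpha(X_1)])^{\beta}\, \mathrm{cov}(W(B,s), W(B',s')) = E\left[ \mathcal{L}_1(A, \sigma(X_{1,h}), \sigma(X_{1,h}), s, s') \times P(Y_{m,m+h'} \in B, Y_{m,m+h'} \in B' \mid \mathcal{X}) \right]$$

$$+ \sum_{j=2}^{h \wedge (m-h)} E\Big[ \mathcal{L}_j(A, \sigma(X_{1,h}), \sigma(X_{j,j+h-1}), s, s') \times \big\{ P(Y_{m,m+h'} \in B, Y_{m+j-1,m+h'+j-1} \in B' \mid \mathcal{X}) + P(Y_{m,m+h'} \in B', Y_{m+j-1,m+h'+j-1} \in B \mid \mathcal{X}) \big\} \Big] , \qquad (45)$$

where the functions $\mathcal{L}_j$ are defined in Assumption 2.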

Claim 2. For each fixed $B$, $E_{n,1}(B,\cdot)$ is tight in $\mathcal{D}([a,b])$ for each $0 < a < b$.
This claim is proved in Lemma C.3.

The previous two statements are valid in both weakly dependent and long memory case. The 
next one may not be valid in the long memory case. See Section 3.2. 

Claim 3. In the weakly dependent case $E_{n,2}(B,\cdot) = O_P(1/\sqrt{n})$, uniformly with respect to $s \in [a,b]$ for any $0 < a < b$.

The next claim is proved in [Kulik and Soulier, 2011, Corollary 2.4]. 
Claim 4. $\hat\xi_n - 1 = o_P(1)$.

The last thing we need is the negligibility of the bias term. 

Claim 5. For any $a > 0$, $\sup_{s \ge a} \sup_B |K_n(B,s) - K(B,s)| = O(v_n(A))$.






Therefore if $n g(k/n) \to \infty$ and (24) holds (i.e. $n g(k/n) v_n(A) \to 0$), then

$$\sqrt{n g(k/n)}\, \left\{ \hat K_n(B,\cdot) - K(B,\cdot),\ \hat e_n(\cdot) - K(\mathbb{R}^{h'+1},\cdot) \right\} \Rightarrow \left( W(B,\cdot), W(\mathbb{R}^{h'+1},\cdot) \right) .$$

This convergence and the decomposition (41) imply

$$\sqrt{n g(k/n)}\, \mu_C(A) \left\{ \hat\rho_n(A,B,m) - \rho(A,B,m) \right\} \to_d W(B,1) - \rho(A,B,m)\, W(\mathbb{R}^{h'+1},1) .$$

This distribution is Gaussian. Applying (45) and the fact that $\rho(A, \mathbb{R}^{h'+1}, m) = 1$, it is easily checked that its variance is given by (25). This concludes the proof of Theorem 2. □

We now prove the claims. 

Proof of Claim 1. For $j = 1, \dots, n$, denote

$$\zeta_{n,j}(B,s) = \frac{1}{\sqrt{n g(k/n)}} W_{j,n}(s) V_j(B) .$$

In order to prove our claim, we apply the central limit theorem for m-dependent random variables, see Orey [1958]. Let $C(B,B',s,s')$ denote the quantity in the right hand side of (45). We need to check that

$$\mathrm{cov}\Big( \sum_{j=1}^n \zeta_{n,j}(B,s), \sum_{j=1}^n \zeta_{n,j}(B',s') \,\Big|\, \mathcal{X} \Big) \to_P C(B,B',s,s') , \qquad (46)$$

n 

J2nCAB,s)\X]^pO. (47) 

i=i 

By standard Lindeberg-Feller type arguments, this proves the one-dimensional convergence. 
The finite-dimensional convergence is proved by similar arguments and by computing the 
asymptotic covariances. We now prove (46) and (47). 

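For $u \ge 1$, $x, x' \in \mathbb{R}^h$, denote

$$\mathcal{L}_{n,u}(A, x, x', s, s') = \frac{P(\sigma(x)\cdot Z_{1,h} \in u_n s A,\ \sigma(x')\cdot Z_{u,u+h-1} \in u_n s' A)}{g(\bar F_Z(u_n))} .$$

For $1 \le u \le h$, by Assumptions 1 and 2, the functions $\mathcal{L}_{n,u}$ converge in $L^p(X_{1,h}, X_{u,u+h-1})$ to the functions $\mathcal{L}_u$ defined in Assumption 2. For $u > h$, $Z_{1,h}$ and $Z_{u,u+h-1}$ are independent, so $\mathcal{L}_{n,u}$ converges a.s. and in $L^p(X_{1,h}, X_{u,u+h-1})$ to 0.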

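The random variables $\zeta_{n,j}$ are $(m+h')$-dependent. Thus,

$$\mathrm{cov}\Big( \sum_{j=1}^n \zeta_{n,j}(B,s), \sum_{j=1}^n \zeta_{n,j}(B',s') \,\Big|\, \mathcal{X} \Big) = \sum_{j=1}^n \mathrm{cov}(\zeta_{n,j}(B,s), \zeta_{n,j}(B',s') \mid \mathcal{X})$$

$$+ \sum_{j=1}^n \sum_{u=1}^{m+h'} \mathrm{cov}(\zeta_{n,j}(B,s), \zeta_{n,j+u}(B',s') \mid \mathcal{X}) \qquad (48)$$

$$+ \sum_{j=1}^n \sum_{u=1}^{m+h'} \mathrm{cov}(\zeta_{n,j+u}(B,s), \zeta_{n,j}(B',s') \mid \mathcal{X}) . \qquad (49)$$

For $u = 1, \dots, h \wedge (m-h)$ it is easily seen that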

n 



ng{k/n) ^ 

X P(Yj+,;„j-+m_|_/i G B,Yj^u+m,j+u+m+h' ^ B' \ X) 



This yields the right-hand side of (45), so we must prove that the remaining terms in (48) and (49) are negligible. If $h > m-h$, then for large $n$ and $m-h < u \le h$, we have $(u_n s' A) \cap B = \emptyset$, so, for all $j = 1, \dots, n$,

$$P\left( Y_{j,j+h-1} \in u_n s A,\ Y_{j+u,j+u+h-1} \in u_n s' A,\ Y_{j+m,j+m+h'} \in B,\ Y_{j+u+m,j+u+m+h'} \in B' \mid \mathcal{X} \right) = 0 .$$

For $u > h$, as mentioned above, $\mathcal{L}_{n,u}(A,\cdot,\cdot,s,s')$ converges to 0 in $L^p(X_{1,h}, X_{u,u+h-1})$, so

$$\sum_{j=1}^n \mathrm{cov}(\zeta_{n,j}(B,s), \zeta_{n,j+u}(B',s') \mid \mathcal{X}) \to_P 0 .$$

This proves (46). Next, since the $\zeta_{n,j}$ are multiples of indicators, applying (37) proves (47) and the weak convergence of finite dimensional distributions. □

Proof of Claim 3. By definition of the functions $L_n$ and $G_n$ (cf. (38) and (39)), it clearly holds that

$$|G_n(s, X_{j,j+h-1}, X_{j+m,j+m+h'}, B)| \le L_n(s, X_{j,j+h-1}) .$$

We apply the variance inequality (B.3) in the weak dependence case to get

$$\mathrm{var}(E_{n,2}(B,s)) \le \frac{C}{n} \mathrm{var}(G_n(s, X_{1,h}, X_{1+m,1+m+h'}, B)) \le \frac{C}{n} E[L_n^2(s, X_{1,h})] .$$

By (11), $L_n(s,x) \le M_A(\sigma(x))$. Thus, by Assumption 3, the right hand side is uniformly bounded, so $\mathrm{var}(E_{n,2}(B,s)) = O(1/n)$ and, for any fixed $s > 0$, $\sqrt{n}\, E_{n,2}(B,s) = O_P(1)$. Tightness follows from Lemma C.4, thus $E_{n,2}(B,\cdot)$ converges uniformly to 0 on any compact set of $(0,\infty]$. □

Proof of Claim 5. Consider now the bias term $K_n - K$. Recall that (see (42) and (37))

$$K_n(B,s) = E[\hat K_n(B,s)] \to T_C(s)\,\mu_C(A)\,\rho(A,B,m) = K(B,s) .$$

Therefore, $K_n(B,s)$ converges pointwise to $K(B,s)$. The goal here is to show that this convergence is uniform. Using the definition of $K_n$, (38) and (39) we have

$$K_n(B,s) = E[G_n(s, X_{1,h}, X_{m,m+h'}, B)] = E[L_n(s, X_{1,h})\, P(\sigma(X_{m,m+h'})\cdot Z_{m,m+h'} \in B \mid \mathcal{X})] .$$

Using this definition and recalling the formula for $\rho(A,B,m)$ (see (13)),

$$K(B,s) = T_C(s)\, E[L(X_{1,h})\, P(\sigma(X_{m,m+h'})\cdot Z_{m,m+h'} \in B \mid \mathcal{X})] .$$

Therefore, recalling the definition (22) of $v_n(A)$, we obtain that

$$|K_n(B,s) - K(B,s)| \le E\left[ \sup_{s \ge 1} |L_n(s, X_{1,h}) - T_C(s) L(X_{1,h})| \right] = v_n(A) .$$

□



Proof of Corollary 3. In the following, $y$ stands for the set $(-\infty, y]$ in the previous notation. For $y \in \mathbb{R}^{h'+1}$ rewrite the decomposition (41) in the present context to get

$$\hat\Psi_n(y) - \Psi(y) = \frac{\hat K_n(y,\hat\xi_n) - K(y,\hat\xi_n)}{\hat e_n(\hat\xi_n)} - \frac{\Psi(y)}{\hat e_n(\hat\xi_n)} \left\{ \hat e_n(\hat\xi_n) - \mu_C(A) T_C(\hat\xi_n) \right\} .$$

Thus we need only prove that the sequence of suitably normalized processes $\hat K_n(y,s) - K_n(y,s)$ converges weakly to the claimed limit. The convergence of finite dimensional distributions follows from Theorem 2 and the tightness follows from Lemmas C.3 and C.4. □

Proof of Theorem 4. Claims 1, 2, 4 and 5 hold under the assumptions of Theorem 4. Thus, the result will follow if we prove a modified version of Claim 3.

Claim 6. If $2\tau(A,B)(1-H) < 1$, then $\gamma_n^{-\tau(A,B)/2} E_{n,2}(A,B,\cdot)$ converges weakly uniformly on compact sets of $(0,\infty]$ to a process $T_C \cdot Z(A,B)$ where the random variable $Z(A,B)$ is in a Gaussian chaos of order $\tau(A,B)$ and its distribution depends only on the Gaussian process $\{X_n\}$.

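For any $d \in \mathbb{N}^*$, $q \in \mathbb{N}^d$ and $x \in \mathbb{R}^d$, denote

$$H_q(x) = \prod_{i=1}^d H_{q_i}(x_i) .$$

Define $\mathbf{X}_j = (X_{j+1}, \dots, X_{j+h}, X_{j+m}, \dots, X_{j+m+h'})$. The Hermite coefficients of $G_n(s,\cdot)$ and $G$ with respect to $\mathbf{X}_0$ can be expressed, for $q \in \mathbb{N}^{h+h'+1}$, as

$$J_n(q,s) = E[H_q(\mathbf{X}_0) G_n(s,\mathbf{X}_0)] , \qquad J(q) = E[H_q(\mathbf{X}_0) G(\mathbf{X}_0)] .$$

Since $G_n(s,\cdot)$ converges to $T_C(s) G(\cdot)$ in $L^p(\mathbf{X}_0)$ for some $p > 1$, $J_n(q,s)$ converges to $T_C(s) J(q)$. Let $U$ be an $(h+h'+1) \times (h+h'+1)$ matrix such that $U U'$ is equal to the inverse of the covariance matrix of $\mathbf{X}_0$. Define $J_n^*(q,s) = E[H_q(U\mathbf{X}_0) G_n(s,\mathbf{X}_0)]$ and $J^*(q) = E[H_q(U\mathbf{X}_0) G(\mathbf{X}_0)]$. Under Assumption 5, the function $G_n$ can be expanded for $x \in \mathbb{R}^{h+h'+1}$ as

$$G_n(s,x) - E[G_n(s,\mathbf{X}_0)] = \sum_{|q| = \tau(A,B)} \frac{J_n^*(q,s)}{q!} H_q(Ux) + r_n(s,x) ,$$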
where $r_n$ is implicitly defined and has Hermite rank at least $\tau(A,B)+1$ with respect to $U\mathbf{X}_0$. Denote $R_n(s) = n^{-1}\sum_{j=1}^n r_n(s,\mathbf{X}_j)$. Applying (B.3), we have

$$\mathrm{var}(R_n(s)) \le C \left( \gamma_n^{\tau(A,B)+1} \vee n^{-1} \right) \mathrm{var}(G_n(s,\mathbf{X}_0)) \le C \left( \gamma_n^{\tau(A,B)+1} \vee n^{-1} \right) E[L_n^2(s, X_{1,h})] .$$

By Assumption 3, $E[L_n^2(s, X_{1,h})]$ is uniformly bounded, thus $\mathrm{var}(R_n(s)) = O(\gamma_n^{\tau(A,B)+1} \vee n^{-1})$ and $\gamma_n^{-\tau(A,B)/2} R_n(s)$ converges weakly to zero. The convergence is uniform by an application of Lemma C.1.

Thus, the asymptotic behaviour of $\gamma_n^{-\tau(A,B)/2} E_{n,2}$ is the same as that of

$$Z_n(s) = \gamma_n^{-\tau(A,B)/2} \sum_{|q| = \tau(A,B)} \frac{J_n^*(q,s)}{q!}\, \frac{1}{n} \sum_{j=1}^n H_q(U\mathbf{X}_j) .$$

By [Arcones, 1994, Theorem 6], there exist random variables $H^*(q)$ such that $Z_n(s)$ converges to

$$\sum_{|q| = \tau(A,B)} \frac{T_C(s) J^*(q)}{q!} H^*(q)$$

for each $s > 0$. To prove that the convergence is uniform, we only need to prove that $J_n^*(q,\cdot)$ converges uniformly to $T_C \cdot J^*(q)$ for each $q$ such that $|q| = \tau(A,B)$. Since the coefficients $J_n^*$ can be expressed linearly in terms of the coefficients $J_n$, it suffices to prove uniform convergence of the coefficients $J_n$. Applying Hölder's inequality, we obtain, for $p > 1$ and for any $a > 0$,

$$\sup_{s \ge a} |J_n(q,s) - T_C(s) J(q)| \le C \left( E\left[ \sup_{s \ge a} |L_n(s, X_{1,h}) - T_C(s) L(X_{1,h})|^p \right] \right)^{1/p} .$$

We have already seen that this last quantity converges to 0 for $p = 2$ by Assumption 3. □

Appendix 

A Second order regular variation of convolutions 

Denote $A \asymp B$ if there exist positive constants $c_1$ and $c_2$ such that $c_1 A \le B \le c_2 A$.

Lemma A.1. Let $Z_1$ and $Z_2$ be i.i.d. non negative random variables with common distribution function $F$ that satisfies Assumption 6. Then

$$\left| P(u_1 Z_1 + u_2 Z_2 > t) - \bar F(t/u_1) - \bar F(t/u_2) \right| \le C u_1^{\alpha+\epsilon} u_2^{\alpha+\epsilon}\, t^{-1} \bar F(t) \int_0^t \bar F(s)\, ds .$$

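Proof. Obviously, we have

$$P(u_1 Z_1 + u_2 Z_2 > t) = \bar F(t/u_1) + \bar F(t/u_2) - \bar F(t/u_1)\bar F(t/u_2)$$
$$+ P(t/2 < u_1 Z_1 \le t)\, P(t/2 < u_2 Z_2 \le t)$$
$$+ P(u_1 Z_1 \le t/2,\ u_2 Z_2 \le t,\ u_1 Z_1 + u_2 Z_2 > t)$$
$$+ P(u_2 Z_2 \le t/2,\ u_1 Z_1 \le t,\ u_1 Z_1 + u_2 Z_2 > t) .$$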
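The decomposition above is a partition of the event $\{u_1 Z_1 + u_2 Z_2 > t\}$, so it can be checked exactly on any discrete distribution. The sketch below does this with rational arithmetic; the function name `check_decomposition` and the particular distribution are illustrative choices, not part of the original argument.

```python
from fractions import Fraction
from itertools import product


def check_decomposition(values, probs, u1, u2, t):
    """Return both sides of the decomposition for i.i.d. discrete Z1, Z2."""
    def prob(event):
        # Exact probability of an event on (u1 * Z1, u2 * Z2).
        return sum(
            p1 * p2
            for (z1, p1), (z2, p2) in product(zip(values, probs), repeat=2)
            if event(u1 * z1, u2 * z2)
        )

    half = t / 2
    tail1 = prob(lambda a, b: a > t)  # survival function at t / u1
    tail2 = prob(lambda a, b: b > t)  # survival function at t / u2
    lhs = prob(lambda a, b: a + b > t)
    rhs = (
        tail1 + tail2 - tail1 * tail2
        + prob(lambda a, b: half < a <= t) * prob(lambda a, b: half < b <= t)
        + prob(lambda a, b: a <= half and b <= t and a + b > t)
        + prob(lambda a, b: b <= half and a <= t and a + b > t)
    )
    return lhs, rhs


values = [0, 1, 2, 3, 4, 5]
probs = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 6),
         Fraction(1, 6), Fraction(1, 12), Fraction(1, 12)]
lhs, rhs = check_decomposition(values, probs, u1=1, u2=2, t=Fraction(4))
print(lhs == rhs)
```

The three last terms cover the case where both summands are at most $t$: either both exceed $t/2$, or exactly one is at most $t/2$; both cannot be at most $t/2$ while the sum exceeds $t$.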
Consider for instance the second last term. It may be written as



h :=IE 



^{uiZi<t/2} 



\ F{t{l-uiZ^/t)/u2) ^ 

I F{t/U2) 



Since F satisfies Assumption 6, we have, for u G [1/2, 1], 

/■u rjits) 1 ru r)(ts) i r}(ts 

^ ^ 1 - 1 = {n-" - l}e/i + e/i . 

< \u — 1 e-'i/^ +e''i/2 *i / — ^ — -as. 



- Fit] 



Since r]*(t) is decreasing, we have, for all u € [1/2, 1] 



< 



1 < - 1| + log(u)} < C(l - tx) . 



Applying this inequality with 1 — u = uiZi/t on the event uiZi < t/2 yields 

h < Cuit-^E [Zil{u,z,<t}] < Cr^ f'"' F{s) ds = Cr\^^ f F{s/ui) ds 

Jo Jo 

By Potter's bounds, for any e > 0, there exists a constant C such for any s,t > 0, 



F{s/ui 



F{s) 



Applying this bound we obtain 



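$$I_1 \le C (u_1 \vee 1)^{\alpha+\epsilon} (u_2 \vee 1)^{\alpha+\epsilon}\, t^{-1} \bar F(t) \int_0^t \bar F(s)\, ds .$$

To conclude, note that $\bar F^2(t) = O\left( t^{-1} \bar F(t) \int_0^t \bar F(s)\, ds \right)$ if $\alpha \le 1$ and $\bar F^2(t) = o\left( t^{-1} \bar F(t) \int_0^t \bar F(s)\, ds \right)$ if $\alpha > 1$. □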

Remark 6. By induction, we can obtain the bound

$$\left| P(Z_1 + \cdots + Z_n > t) - n \bar F(t) \right| \le C\, t^{-1} \bar F(t) \int_0^t \bar F(s)\, ds$$

and we can also recover a particular case of a result of Omey and Willekens [1987] in a slightly different form. For $\alpha > 1$ and $E[Z_1] < \infty$,



hm ,[nZ.+yj + Zn>t) _ ^1 ^ 



t^oo \ P(Zi > t) 
B Multivariate Hermite expansions and variance inequalities for Gaussian processes

Consider a multidimensional stationary centered Gaussian process $\{\mathbf{X}_n\}$ with autocovariance function $\gamma_n(i,j) = E[X_0^{(i)} X_n^{(j)}]$ and assume either

$$\forall\, 1 \le i,j \le d , \qquad \sum_{n=0}^\infty |\gamma_n(i,j)| < \infty , \qquad (B.1)$$

or that there exists $H \in (1/2, 1)$ and a function $\ell$ slowly varying at infinity such that

$$\lim_{n\to\infty} \frac{\gamma_n(i,j)}{n^{2H-2} \ell(n)} = b_{i,j} , \qquad (B.2)$$

and the coefficients $b_{i,j}$ are not identically zero. Then, we have the following inequality due to Arcones [1994].

For any function $G$ such that $E[G^2(\mathbf{X}_0)] < \infty$ and with Hermite rank $q$ with respect to $\mathbf{X}_0$,

$$\mathrm{var}\Big( n^{-1} \sum_{j=1}^n G(\mathbf{X}_j) \Big) \le C \left( \ell^q(n)\, n^{2q(H-1)} \vee n^{-1} \right) \mathrm{var}(G(\mathbf{X}_0)) , \qquad (B.3)$$

where the constant $C$ depends only on the Gaussian process $\{\mathbf{X}_n\}$ and not on the function $G$. This bound summarizes Equations 2.18, 3.10 and 2.40 in Arcones [1994]. The rate obtained is $n^{-1}$ in the weakly dependent case where (B.1) holds and in the case where (B.2) holds and $G$ has Hermite rank $q$ such that $2q(1-H) > 1$. Otherwise, the rate is $\ell^q(n)\, n^{2q(H-1)}$.
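As a quick numerical illustration of which of the two rates in (B.3) dominates, the sketch below evaluates the maximum of a long-memory term and $1/n$. The function name `variance_rate`, the default slowly varying function $\ell \equiv 1$, and the exponent $q$ on $\ell$ are assumptions made for this example.

```python
import math


def variance_rate(n, q, H, slowly_varying=lambda n: 1.0):
    """Return max(l(n)^q * n^(2q(H-1)), 1/n), the order of the variance bound."""
    long_memory_term = slowly_varying(n) ** q * n ** (2 * q * (H - 1))
    return max(long_memory_term, 1.0 / n)


n = 10 ** 4
# Hermite rank 1 with H = 0.9: 2q(1-H) = 0.2 < 1, the long-memory term dominates.
print(variance_rate(n, q=1, H=0.9))
# Hermite rank 2 with H = 0.6: 2q(1-H) = 1.6 > 1, the usual 1/n rate applies.
print(variance_rate(n, q=2, H=0.6))
```

With $\ell \equiv 1$ the long-memory term exceeds $1/n$ exactly when $2q(1-H) < 1$, which is the regime considered in Claim 6.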



C A criterion for tightness 

We state a criterion for the tightness of a sequence of random processes with paths in $\mathcal{D}(\mathbb{R}^d)$, which adapts to the present context Bickel and Wichura [1971, Theorem 3] and the remarks thereafter.

Let $T$ be a rectangle $T = T_1 \times \cdots \times T_d \subset \mathbb{R}^d$. A block $B$ in $T$ is a subset of $T$ of the form $\prod_{i=1}^d (s_i, t_i]$ with $s_i < t_i$, $1 \le i \le d$. Disjoint blocks $B = \prod_{i=1}^d (s_i, t_i]$ and $B' = \prod_{i=1}^d (s'_i, t'_i]$ are neighbours if there exists $p \in \{1, \dots, d\}$ such that $s'_p = t_p$ or $s_p = t'_p$, and $s_i = s'_i$ and $t_i = t'_i$ for $i \ne p$. (In the terminology of Bickel and Wichura [1971] the blocks $B$ and $B'$ are said to share a common face.) Let $X$ be a random process indexed by $T$. The increment of the process $X$ over a block $B = \prod_{i=1}^d (s_i, t_i]$ is defined by

$$X(B) = \sum_{(\epsilon_1, \dots, \epsilon_d) \in \{0,1\}^d} (-1)^{d - \sum_{i=1}^d \epsilon_i}\, X(s_1 + \epsilon_1(t_1 - s_1), \dots, s_d + \epsilon_d(t_d - s_d)) .$$

(This is the usual d-dimensional increment of a random process $X$. If for instance $d = 2$, then $X(B) = X(t_1,t_2) - X(t_1,s_2) - X(s_1,t_2) + X(s_1,s_2)$.) If $X$ is an indicator, i.e. $X(y) = 1_{\{Y \le y\}}$ for some $T$ valued random variable $Y$, then $X(B) = 1_{\{Y \in B\}}$.
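The increment formula can be evaluated directly by summing over the $2^d$ corners of the block. The following sketch (the names `block_increment` and `indicator_process` are illustrative) checks the two facts stated above: the $d = 2$ expansion, and the fact that the increment of an indicator process is the indicator that the point falls in the block.

```python
from itertools import product


def block_increment(x, s, t):
    """Increment of x over the block prod_i (s_i, t_i], summed over its 2^d corners."""
    d = len(s)
    total = 0
    for eps in product((0, 1), repeat=d):
        corner = tuple(s[i] + eps[i] * (t[i] - s[i]) for i in range(d))
        total += (-1) ** (d - sum(eps)) * x(corner)
    return total


def indicator_process(point):
    """x(y) = 1 if point <= y componentwise, else 0."""
    return lambda y: int(all(a <= b for a, b in zip(point, y)))


s, t = (0, 0), (2, 3)
inside = indicator_process((1, 2))   # the point (1, 2) lies in (0, 2] x (0, 3]
outside = indicator_process((1, 4))  # the point (1, 4) does not
print(block_increment(inside, s, t), block_increment(outside, s, t))
```

For a product function such as $x(y) = y_1 y_2$, the same routine returns the area $(t_1 - s_1)(t_2 - s_2)$ of the block, which is a convenient sanity check of the signs.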






Lemma C.1. Let $\{\zeta_n\}$ be a sequence of stochastic processes indexed by a compact rectangle $T \subset \mathbb{R}^d$. Assume that the finite dimensional marginal distributions of $\zeta_n$ converge weakly to those of a process $\zeta$ which is continuous on the upper boundary of $T$. Assume moreover that there exist $\gamma > 0$ and $\beta > 1$ such that

$$P(|\zeta_n(B)| \wedge |\zeta_n(B')| \ge \lambda) \le C \lambda^{-\gamma} E[\mu_n^\beta(B \cup B')] \qquad (C.1)$$

for some sequence of random probability measures $\mu_n$ which converges weakly in probability to a (possibly random) probability measure $\mu$ with (almost surely) continuous marginals. Then the sequence of processes $\{\zeta_n\}$ is tight in $\mathcal{D}(T, \mathbb{R})$.

Sketch of proof. For $f$ defined on $T = T_1 \times \cdots \times T_d$, $i \in \{1, \dots, d\}$ and $t \in T_i$, define $f_t^{(i)}$ on $T_1 \times \cdots \times T_{i-1} \times T_{i+1} \times \cdots \times T_d$ by

$$f_t^{(i)}(t_1, \dots, t_{i-1}, t_{i+1}, \dots, t_d) = f(t_1, \dots, t_{i-1}, t, t_{i+1}, \dots, t_d)$$

and define, for $s < t \in T_i$ and $\delta > 0$,

$$w''_i(f,s,t) = \sup_{s < u < v < w < t} \|f_v^{(i)} - f_u^{(i)}\|_\infty \wedge \|f_w^{(i)} - f_v^{(i)}\|_\infty ,$$

$$w''_i(f,\delta) = \sup_{u < v < w < u+\delta} \|f_v^{(i)} - f_u^{(i)}\|_\infty \wedge \|f_w^{(i)} - f_v^{(i)}\|_\infty .$$

By the Corollary of Bickel and Wichura [1971], a sequence of processes $\{X_n\}$ defined on $T$ converges weakly in $\mathcal{D}(T)$ to a process $X$ which is continuous at the upper boundary of $T$ with probability one, if the finite-dimensional marginal distributions of $X_n$ converge to those of $X$ and if, for all $\delta, \lambda > 0$, and all $i = 1, \dots, d$,

$$P(w''_i(X_n,\delta) > \lambda) \to 0 . \qquad (C.2)$$

For any measure $\mu$ on $T$, define its i-th marginal $\mu^{(i)}$ by

$$\mu^{(i)}((s,t]) = \mu(T_1 \times \cdots \times T_{i-1} \times (s,t] \times T_{i+1} \times \cdots \times T_d) , \qquad s < t \in T_i .$$

As mentioned in the remarks after the proof of Bickel and Wichura [1971, Theorem 3], an easy adaptation of the proof of Billingsley [1968, Theorem 15.6] shows that (C.2) is implied by

$$P(w''_i(X_n,s,t) > \lambda) \le C \lambda^{-\gamma} E[\{\mu_n^{(i)}((s,t])\}^\beta] , \qquad (C.3)$$

where $\mu_n$ satisfies the assumptions of the Lemma. So we must show that (C.1) implies (C.3). The proof is by induction, so the first step is to prove it in the one-dimensional case, where (C.1) becomes, for $u < v < w \in T$,

$$P(|\zeta_n(v) - \zeta_n(u)| \wedge |\zeta_n(w) - \zeta_n(v)| \ge \lambda) \le C \lambda^{-\gamma} E[\mu_n^\beta((u,w])] . \qquad (C.4)$$

The proof of (C.3) under the assumption (C.4) follows the lines of the proof of [Billingsley, 1968, (15.26)] under the assumption [Billingsley, 1968, (15.21)]. The key ingredient is the maximal inequality [Billingsley, 1968, Theorem 12.5], which can be easily adapted as follows in the present context. Let $S_0, \dots, S_n$ be random variables. Assume that there exist nonnegative random variables $u_1, \dots, u_n$ such that

$$P(|S_j - S_i| \wedge |S_k - S_j| \ge \lambda) \le \lambda^{-\gamma} E[(u_i + \cdots + u_k)^\beta]$$

for some $\beta > 1$ and $\gamma > 0$ and all $1 \le i \le j \le k \le n$. Then there exists a constant $C$ that depends only on $\beta$ and $\gamma$ such that

$$P\Big( \max_{1 \le i \le j \le k \le n} |S_j - S_i| \wedge |S_k - S_j| \ge \lambda \Big) \le C \lambda^{-\gamma} E[(u_1 + \cdots + u_n)^\beta] .$$

Proving by induction that (C.1) implies (C.3) in the d-dimensional case can be done exactly along the lines of Step 5 of the proof of Bickel and Wichura [1971, Theorem 1]. □

In order to apply this criterion to the context of empirical processes, we need the following Lemma which slightly extends the bound Billingsley [1968, (13.18)].

Lemma C.2. Let $\{(B_i, B'_i)\}$ be a sequence of m-dependent vectors, where $B_i$ and $B'_i$ are Bernoulli random variables, with parameters $p_i$ and $q_i$, respectively, and such that $B_i B'_i = 0$ a.s. Denote $S_n = \sum_{j=1}^n (B_j - p_j)$ and $S'_n = \sum_{j=1}^n (B'_j - q_j)$. Then, there exists a constant $C$ which depends only on $m$, such that

$$E[S_n^2 S_n'^2] \le C \Big( \sum_{j=1}^n p_j \Big) \Big( \sum_{j=1}^n q_j \Big) \le C \Big( \sum_{j=1}^n (p_j + q_j) \Big)^2 . \qquad (C.5)$$

Proof. We start by assuming that the pairs $(B_i, B'_i)$ are independent and we prove (C.5) by induction. For any integrable random variable $X$, denote $\bar X = X - E[X]$. For $n = 1$, since $B_1 B'_1 = 0$, we obtain $E[\bar B_1 \bar B'_1] = -p_1 q_1$ and

$$E[\bar B_1^2 \bar B_1'^2] = E[(B_1 - 2p_1 B_1 + p_1^2)(B'_1 - 2q_1 B'_1 + q_1^2)] = p_1 q_1^2 + p_1^2 q_1 - 3 p_1^2 q_1^2 = p_1 q_1 (p_1 + q_1 - 3 p_1 q_1) \le p_1 q_1 .$$

The last inequality comes from the fact that $B_1 B'_1 = 0$ a.s. implies that $p_1 + q_1 \le 1$, and $0 \le p + q - 3pq \le p + q \le 1$ for all $p, q \ge 0$ such that $p + q \le 1$. Assume now that (C.5) holds with $C = 3$ for some $n \ge 1$. Then, denoting $s_n = \sum_{j=1}^n p_j$ and $s'_n = \sum_{j=1}^n q_j$, we have

$$E[S_{n+1}^2 S_{n+1}'^2] = E[S_n^2 S_n'^2] + E[S_n^2] E[\bar B_{n+1}'^2] + E[S_n'^2] E[\bar B_{n+1}^2] + 4 E[S_n S'_n] E[\bar B_{n+1} \bar B'_{n+1}] + E[\bar B_{n+1}^2 \bar B_{n+1}'^2]$$

$$\le 3 s_n s'_n + s_n q_{n+1} + s'_n p_{n+1} + 4 p_{n+1} q_{n+1} \sum_{i=1}^n p_i q_i + p_{n+1} q_{n+1}$$

$$\le 3 s_n s'_n + 3 s_n q_{n+1} + 3 s'_n p_{n+1} + p_{n+1} q_{n+1} \le 3 s_{n+1} s'_{n+1} .$$

This proves that (C.5) holds for all $n \ge 1$.
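The base case and the bound with $C = 3$ for independent pairs can be checked exactly by enumerating the three possible outcomes of each pair, namely $(1,0)$, $(0,1)$ and $(0,0)$, since $B_i B'_i = 0$. The sketch below uses rational arithmetic; the function names and the parameter values are illustrative.

```python
from fractions import Fraction
from itertools import product


def pair_outcomes(p, q):
    """Outcomes of (B, B') with B B' = 0: (1,0) w.p. p, (0,1) w.p. q, (0,0) otherwise."""
    return [((1, 0), p), ((0, 1), q), ((0, 0), 1 - p - q)]


def fourth_moment(params):
    """Exact E[S_n^2 S'_n^2] for independent pairs with parameters (p_j, q_j)."""
    total = Fraction(0)
    for combo in product(*(pair_outcomes(p, q) for p, q in params)):
        prob = Fraction(1)
        s = Fraction(0)
        s_prime = Fraction(0)
        for ((b, b_prime), pr), (p, q) in zip(combo, params):
            prob *= pr
            s += b - p
            s_prime += b_prime - q
        total += prob * s ** 2 * s_prime ** 2
    return total


# Base case n = 1: closed form p q (p + q - 3 p q).
p, q = Fraction(1, 3), Fraction(1, 2)
print(fourth_moment([(p, q)]) == p * q * (p + q - 3 * p * q))

# Bound E[S_n^2 S'_n^2] <= 3 s_n s'_n for three independent pairs.
params = [(Fraction(1, 3), Fraction(1, 2)),
          (Fraction(1, 4), Fraction(1, 4)),
          (Fraction(1, 5), Fraction(3, 5))]
s_n = sum(pj for pj, _ in params)
s_n_prime = sum(qj for _, qj in params)
print(fourth_moment(params) <= 3 * s_n * s_n_prime)
```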

We now consider the case of m-dependence. Let $a_i$, $1 \le i \le n$, be a sequence of real numbers and set $a_i = 0$ if $i > n$. Then

$$\Big( \sum_{i=1}^n a_i \Big)^2 = \Big( \sum_{q=1}^m \sum_{j=1}^{\lceil n/m \rceil} a_{(j-1)m+q} \Big)^2 \le m \sum_{q=1}^m \Big( \sum_{j=1}^{\lceil n/m \rceil} a_{(j-1)m+q} \Big)^2 .$$

Applying this and the bound for the independent case (extending all sequences by zero after the index $n$) yields

$$E[S_n^2 S_n'^2] \le 3 m^2 \sum_{q=1}^m \sum_{q'=1}^m \sum_{j=1}^{\lceil n/m \rceil} \sum_{j'=1}^{\lceil n/m \rceil} p_{(j-1)m+q}\, q_{(j'-1)m+q'} = 3 m^2 s_n s'_n .$$

□

Let us apply this criterion in the context of Section 3. Fix a cone $C$ and a relatively compact subset $A \subset C$. Recall that $E_{n,1}$ and $E_{n,2}$ are defined in (43) and (44).

Lemma C.3. Under the assumptions of Theorem 2 or 4, for any fixed $B \subset \mathbb{R}^{h'+1}$, $E_{n,1}(B,\cdot)$ is tight in $\mathcal{D}([a,b])$, and if moreover $\Psi_{A,m,h}$ is continuous, then $E_{n,1}$ is tight in $\mathcal{D}(\mathcal{K} \times [a,b])$ for any $0 < a < b$ and any compact set $\mathcal{K}$ of $\mathbb{R}^{h'+1}$.

Proof. By Assumption 4, if $s < t$, then $tA \subset sA$. Thus, a sequence of random measures $\mu_n$ on $\mathbb{R}^{h'+1} \times (0,\infty)$ can be defined by

$$\mu_n((-\infty,y] \times (s,\infty)) = \frac{1}{n} \sum_{j=1}^n \frac{P(Y_{j,j+h-1} \in s u_n A \mid \mathcal{X})}{g(k/n)}\, P(Y_{j+m,j+m+h'} \le y \mid \mathcal{X}) = \frac{1}{n} \sum_{j=1}^n G_n(s, X_{j,j+h-1}, X_{j+m,j+m+h'}, y) ,$$

where $G_n$ is defined in (39). Then $\mu_n$ converges vaguely in probability to the measure $\mu$ defined by

$$\mu((-\infty,y] \times (s,\infty)) = \mu_C(A)\, T_C(s)\, \Psi_{A,m,h}(y) .$$

Then, by conditional m-dependence, for any neighbouring relatively compact blocks $D, D'$ of $\mathbb{R}^{h'+1} \times (0,\infty]$, applying Lemma C.2 yields

$$E[E_{n,1}^2(D) E_{n,1}^2(D') \mid \mathcal{X}] \le C \mu_n(D) \mu_n(D') .$$

Taking unconditional expectations then yields

$$E[E_{n,1}^2(D) E_{n,1}^2(D')] \le C E[\mu_n(D) \mu_n(D')] \le C E[\mu_n^2(D \cup D')] .$$

Thus (C.1) holds with $\beta = \gamma = 2$. In the context of Theorem 2, for any fixed $B$, this implies that $E_{n,1}(B,\cdot)$ is tight, since the limiting distribution is proportional to $T_C(s)$ which is continuous. If the distribution function $\Psi$ is assumed to be continuous, then Lemma C.1 applies and the process $E_{n,1}$ is tight with respect to both variables. □

Lemma C.4. Under the assumptions of Theorem 2, for any fixed $B \subset \mathbb{R}^{h'+1}$, $E_{n,2}(B,\cdot)$ converges uniformly to zero on compact sets of $(0,\infty]$. Under the assumption of Corollary 3, $E_{n,2}$ converges uniformly to zero on compact sets of $\mathbb{R}^{h'+1} \times (0,\infty]$.

Proof. We only need to prove the tightness. By the variance inequality (B.3) and Hölder's inequality, we have, for any relatively compact neighbouring blocks $D, D'$ of $\mathbb{R}^{h'+1} \times (0,\infty)$,



n\E2,nm A \E2,n{D')\ > A) < X'^ ^E[El^XD)nEln{D')] < \-'E[El^{DUD')] 

where jln is the random measure defined by 

g{k/n) 

Assumptions 1 and 3 imply that $\tilde\mu_n$ converges vaguely on $\mathbb{R}^{h'+1} \times (0,\infty]$, in probability and in the mean square, to the measure $\tilde\mu$ defined by

A((-oo,y] X (s,oo]) = ^cMXi,,) i-A) <y\X). 

The measure $\tilde\mu$ has continuous marginals if we consider the case of a fixed $B$ (which takes care of Theorem 4). The marginals of $\tilde\mu$ are almost surely continuous if $F_Z$ is continuous, so Lemma C.1 applies. □